Abstract. We introduce the notion of an interpolating path on the set of probability measures on finite 
graphs. Using this notion, we first prove a displacement convexity property of entropy along such a 
path and derive Prekopa-Leindler type inequalities, a Talagrand transport-entropy inequality, certain 
HWI type as well as log-Sobolev type inequalities in discrete settings. To illustrate through examples, 
we apply our results to the complete graph and to the hypercube for which our results are optimal - by 
passing to the limit, we recover the classical log-Sobolev inequality for the standard Gaussian measure 
with the optimal constant. 



1. Introduction 

In recent years, Optimal Transport and its link with the Ricci curvature in Riemannian geometry 
have attracted a considerable amount of attention. The extensive modern book by C. Villani [55] is one of 
the main references on this topic. However, while a lot is now known in the Riemannian setting (and 
more generally in geodesic spaces), very little is known so far in discrete spaces (such as finite graphs 
or finite Markov chains), with the notable exception of some notions of (discrete) Ricci curvature 
proposed recently by several authors. Unfortunately, there is not yet a satisfactory (universally agreed 
upon) resolution even there; see Bonciocat-Sturm [6], Erbar-Maas [12], Hillion [17], Joulin [21], 
Lin-Yau [28], Maas [30], Mielke [36], Ollivier [37], and recent works on the displacement convexity 
of entropy by Hillion [18], Lehec and Léonard [27]. 

In particular, the notions of Transport inequalities, HWI inequalities, interpolating paths on the 
measure space, displacement convexity of entropy, are yet to be properly introduced, analyzed and 
understood in discrete spaces. This is the chief aim of the present paper, and of a companion paper 
fi31 . Due to its theoretical as well as applied appeal, this subject is at the intersection of many areas 
of Mathematics, such as Calculus of Variations, Probability Theory, Convex Geometry and Analysis, 
as well as Combinatorial Optimization. 

In order to present our results, let us first introduce some of the relevant notions in the continuous 
framework of geodesic spaces, see [55]. 

A complete, separable metric space $(X, d)$ is said to be a geodesic space if, for all $x_0, x_1 \in X$, there 
exists at least one path $\gamma\colon [0,1] \to X$ such that $\gamma(0) = x_0$, $\gamma(1) = x_1$ and 

\[ d(\gamma(s), \gamma(t)) = |t - s|\, d(x_0, x_1), \qquad \forall s, t \in [0,1]. \]

Such a path is then called a constant speed geodesic between $x_0$ and $x_1$. 
Then, for $p \ge 1$, let $\mathcal{P}_p(X)$ be the set of Borel probability measures on $X$ having a finite $p$-th 
moment, namely 

\[ \mathcal{P}_p(X) := \Big\{ \mu \text{ Borel probability measure} : \int d(x_0,x)^p\, \mu(dx) < +\infty \Big\}, \]

where $x_0 \in X$ is arbitrary ($\mathcal{P}_p(X)$ does not depend on the choice of the point $x_0$), and define the 
following $L_p$-Wasserstein distance: for $\nu_0, \nu_1 \in \mathcal{P}_p(X)$, set 

\[ W_p(\nu_0,\nu_1) := \Big( \inf_{\pi \in \Pi(\nu_0,\nu_1)} \iint d(x,y)^p\, d\pi(x,y) \Big)^{1/p}, \tag{1.1} \]

where $\Pi(\nu_0, \nu_1)$ is the set of couplings of $\nu_0$ and $\nu_1$. 

The metric space $(\mathcal{P}_p(X), W_p)$ is canonically associated to the original metric space $(X, d)$. Namely, 
if $p > 1$, $(\mathcal{P}_p(X), W_p)$ is geodesic if and only if $(X, d)$ is geodesic, see [52]. 

A remarkable and powerful fact is that, when $X$ is a Riemannian manifold, one can relate the Ricci 
curvature of the space to the convexity of entropy along geodesics [34, 8, 43, 29, 51, 54]. More 
precisely, under the Bakry-Émery $CD(K, \infty)$ condition, namely if the space $(X,d,\mu)$ is 
such that $\mathrm{Ric} + \mathrm{Hess}\, V \ge K$, where $\mu(dx) = e^{-V(x)}dx$, then one can prove that for all $\nu_0, \nu_1 \in \mathcal{P}_2(X)$ 
whose supports are included in the support of $\mu$, there exists a constant speed $W_2$-geodesic $\{\nu_t\}_{t\in[0,1]}$ 
from $\nu_0$ to $\nu_1$ such that 

\[ H(\nu_t|\mu) \le (1 - t)H(\nu_0|\mu) + tH(\nu_1|\mu) - \frac{K}{2}\, t(1 - t)\, W_2^2(\nu_0, \nu_1), \qquad \forall t \in [0, 1], \tag{1.2} \]

where $H(\nu|\mu)$ denotes the relative entropy of $\nu$ with respect to $\mu$. Inequality (1.2) is known as the 
$K$-displacement convexity of the entropy. In fact, a converse statement also holds: if the entropy is 
$K$-displacement convex, then the Ricci curvature is bounded below by $K$. This equivalence was used 
as a guideline for the definition of the notion of curvature in geodesic spaces by Sturm-Lott-Villani 
in their celebrated works [29, 52, 53]. 

Moreover, it is known that the $K$-displacement convexity of the entropy is a very strong notion 
that implies many well-known inequalities in Convex Geometry and in Probability Theory, such 
as the Brunn-Minkowski inequality, the Prékopa-Leindler inequality, Talagrand's transport-entropy 
inequality, the HWI inequality, the log-Sobolev inequality, etc., see [55]. 

The question one would like to address is whether one can extend the above theory to discrete 
settings such as finite graphs, equipped with a set of probability measures on the vertices and with a 
natural graph distance. 

Let us mention two main obstructions. Firstly, $W_2$-geodesics do not exist in discrete settings (the 
reader can verify this fact by considering two nearest neighbors $x,y$ in the graph $G = (V,E)$ and 
trying to construct a constant speed geodesic between the two Dirac measures $\delta_x, \delta_y$ at the vertices $x$ and 
$y$). Secondly, the following Talagrand's transport-entropy inequality 

\[ W_2^2(\nu,\mu) \le C\, H(\nu|\mu), \qquad \forall \nu \in \mathcal{P}_2(V) \tag{1.3} \]

(for a suitable constant $C > 0$) does not hold in discrete settings unless $\mu$ is a Dirac measure! From 
these simple observations we deduce that $W_2$ is not well adapted either for defining the path $\{\nu_t\}_{t\in[0,1]}$ 
or for measuring the defect/excess in the convexity of entropy in a discrete context. 

In this paper, our contribution is to introduce the notion of an interpolating path $\{\nu_t\}_{t\in[0,1]}$ and of 
a weak transport cost $\widetilde W_2$ (that in a sense goes back to Marton [31, 32]). These will in turn help us 
derive the desired displacement convexity results on finite graphs. 
Before presenting our results, we give a brief state of the art of the field (to the best of our 
knowledge). 

In [38], Ollivier and Villani prove that, on the hypercube $\Omega_n = \{0, 1\}^n$, for any probability measures 
$\nu_0,\nu_1$, there exists a probability measure $\nu_{1/2}$ (concentrated on the set of mid-points, see [38] for a 
precise definition) such that 

\[ H(\nu_{1/2}|\mu) \le \frac{1}{2}H(\nu_0|\mu) + \frac{1}{2}H(\nu_1|\mu) - \frac{1}{80n}\, W_1^2(\nu_0,\nu_1), \]

where $\mu \equiv 1/2^n$ is the uniform measure and $W_1$ is defined with the Hamming distance. They observe 
that this in turn implies some curved Brunn-Minkowski inequality on $\Omega_n$. The constant $1/n$ encodes, 
in some sense, the discrete Ricci curvature of the hypercube in accordance with the various definitions 
of the discrete Ricci curvature (see above for references). 

In [12], Erbar and Maas introduce a pseudo Wasserstein distance $\mathcal{W}_2$ that corresponds to the 
geodesic distance on the set $\mathcal{P}(\Omega_n)$ of probability measures on the hypercube $\Omega_n$, equipped with a 
Riemannian metric. (In fact, their construction is more general and applies to a wide class of Markov 
kernels on finite graphs.) This metric is such that the continuous time random walk on the graph 
becomes a gradient flow of the function $H(\cdot|\mu)$. Moreover they prove, inter alia, that if $\{\nu_t\}_{t\in[0,1]}$ is a 
geodesic from $\nu_0$ to $\nu_1$, then 

\[ H(\nu_t|\mu) \le (1 - t)H(\nu_0|\mu) + tH(\nu_1|\mu) - \frac{1}{n}\, t(1 - t)\, \mathcal{W}_2^2(\nu_0, \nu_1), \qquad \forall t \in [0, 1], \]

where $\mu \equiv 1/2^n$ is the uniform measure. Independently, Mielke [36] also obtains similar results. As a 
consequence of their displacement convexity property, these authors derive versions of log-Sobolev, 
HWI and Talagrand's transport-entropy inequalities (involving the $\mathcal{W}_2$ and $W_1$ distances) with sharp 
constants. 

In a different direction (at the level of functional inequalities), besides the study of the log-Sobolev 
inequality which is somehow now classical (see e.g. [46]), Sammer and the last named author 
[48, 47] studied Talagrand's inequality in discrete spaces, with $W_1$ on the left hand side of (1.3). They 
also derived a discrete analogue of the Otto-Villani result [39]: that a modified log-Sobolev inequality 
implies the $W_1$-type Talagrand inequality. Connected to this, a few years ago, following seminal 
work of Bobkov and Ledoux 0, several researchers independently realized that modified versions 
of logarithmic Sobolev inequalities helped capture refined information that was lost while working 
with the classic log-Sobolev inequality of Gross. In the discrete setting of finite Markov chains, 
one such modified log-Sobolev inequality has been instrumental in capturing the rate of convergence 
to equilibrium in the (relative) entropy sense, see e.g. Q, flTOl, (51, 03. OH, EH, EH. The 
current state of knowledge in identifying precise sufficient criteria to derive bounds on the entropy 
decay (or on the corresponding modified log-Sobolev constants) is unfortunately rather meagre. This 
is an independent motivation for our efforts at developing the discrete aspects of the displacement 
convexity property and related notions. 

Now we describe some of the main results of the present paper. At first, we shall introduce the 
notion of an interpolating path $\{\nu_t^\pi\}_{t\in[0,1]}$, on the set of probability measures on graphs, between two 
arbitrary probability measures $\nu_0,\nu_1$. In fact, we define a family of interpolating paths, depending on 
a parameter $\pi \in \Pi(\nu_0, \nu_1)$, which is a coupling of $\nu_0, \nu_1$. The construction of this interpolating path is 
inspired by a certain binomial interpolation due to Johnson [20], see also [17, 18, 19]. In particular, 
we shall prove that such an interpolating path, for a properly chosen coupling $\pi^*$ (namely an optimal 
coupling for $W_1$), is actually a $W_1$ constant speed geodesic: i.e. $W_1(\nu_t^{\pi^*}, \nu_s^{\pi^*}) = |t - s|\, W_1(\nu_0,\nu_1)$ for 
all $s,t \in [0, 1]$, with $W_1$ defined with the graph distance $d$ (see Proposition 2.5 below). Such a family 
enjoys a tensorisation property (see Lemma 2.10) that is crucial in our derivation of the displacement convexity 
property on products of graphs. 

Indeed, we shall prove the following tensoring property of a displacement convexity of entropy 
along the interpolating path $\{\nu_t^\pi\}_{t\in[0,1]}$. This is one of our main results (see below and Theorem 4.6). 
In order to state the result, we define here the notion of a quadratic cost, which we will elaborate on 
in the later sections. 

Let $G = (V, E)$ be a (finite) connected, undirected graph, and let $\mathcal{P}(V)$ denote the set of probability 
measures on the vertex set $V$. Given two probability measures $\nu_0$ and $\nu_1$ on $V$, let $\Pi(\nu_0,\nu_1)$ denote 
the set of couplings (joint distributions) of $\nu_0$ and $\nu_1$. Given $\pi \in \Pi(\nu_0, \nu_1)$, consider the probability 
kernels $p$ and $\bar p$ defined by 

\[ \pi(x,y) = \nu_0(x)\,p(x,y) = \nu_1(y)\,\bar p(y,x), \qquad \forall x,y \in V, \]

and set 

\[ I_2(\pi) := \sum_{x\in V}\Big(\sum_{y\in V} d(x,y)\,p(x,y)\Big)^2 \nu_0(x), \qquad \bar I_2(\pi) := \sum_{y\in V}\Big(\sum_{x\in V} d(x,y)\,\bar p(y,x)\Big)^2 \nu_1(y). \tag{1.4} \]


We say a graph $G$, equipped with the distance $d$ and a probability measure $\mu \in \mathcal{P}(V)$, satisfies the 
displacement convexity property (of entropy) if there exists a $C = C(G,d,\mu) > 0$ so that for any 
$\nu_0, \nu_1 \in \mathcal{P}(V)$, there exists a $\pi \in \Pi(\nu_0, \nu_1)$ satisfying: 

\[ H(\nu_t^\pi|\mu) \le (1 - t)H(\nu_0|\mu) + tH(\nu_1|\mu) - C\,t(1- t)\big(I_2(\pi) + \bar I_2(\pi)\big), \qquad \forall t \in [0, 1]. \]

The quantity $I_2(\pi)$ goes back to Marton [31, 32] in her definition of the following transport cost, 
which we call the weak transport cost: 

\[ \widetilde W_2^2(\nu_0,\nu_1) := \inf_{\pi\in\Pi(\nu_0,\nu_1)} I_2(\pi) + \inf_{\pi\in\Pi(\nu_0,\nu_1)} \bar I_2(\pi). \]

For more on this Wasserstein-type distance, see [11, 33, 49]. The precise statement of our 
tensorisation theorem is as follows. For a graph, by the graph distance between two vertices, we mean the 
length of a shortest path between the two vertices. 
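
To make definition (1.4) concrete, the following minimal sketch (not taken from the paper) evaluates $I_2(\pi)$ and $\bar I_2(\pi)$ for a coupling on a small graph. The function name `weak_costs`, the dictionary encoding of the coupling and the two-point example are assumptions chosen purely for illustration.

```python
def weak_costs(dist, pi):
    """Return (I_2(pi), Ibar_2(pi)) as in (1.4).

    dist[x][y] is the graph distance d(x, y); pi maps pairs (x, y) to the
    mass pi(x, y) of a coupling of its two marginals nu_0 and nu_1.
    """
    nu0, nu1 = {}, {}        # marginals of pi
    mean0, mean1 = {}, {}    # sum_y d(x,y) pi(x,y) and sum_x d(x,y) pi(x,y)
    for (x, y), mass in pi.items():
        nu0[x] = nu0.get(x, 0.0) + mass
        nu1[y] = nu1.get(y, 0.0) + mass
        mean0[x] = mean0.get(x, 0.0) + dist[x][y] * mass
        mean1[y] = mean1.get(y, 0.0) + dist[x][y] * mass
    # (sum_y d(x,y) p(x,y))^2 nu_0(x) = mean0[x]^2 / nu_0(x), since p = pi / nu_0
    i2 = sum(m * m / nu0[x] for x, m in mean0.items() if nu0[x] > 0)
    i2_bar = sum(m * m / nu1[y] for y, m in mean1.items() if nu1[y] > 0)
    return i2, i2_bar


# Two-point graph {0, 1}: nu_0 is the Dirac mass at 0, nu_1 is uniform.
dist = {0: {0: 0, 1: 1}, 1: {0: 1, 1: 0}}
pi = {(0, 0): 0.5, (0, 1): 0.5}
i2, i2_bar = weak_costs(dist, pi)
```

In this example the two terms differ ($I_2(\pi) = 0.25$ while $\bar I_2(\pi) = 0.5$), which illustrates that the two halves of the weak transport cost are not symmetric in the marginals.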

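Theorem 1.5. For $i \in \{1, \dots, n\}$, let $\mu^i$ be a probability measure on $G_i = (V_i,E_i)$, with the graph 
distance $d_i$. Assume also that for each $i \in \{1, \dots, n\}$ there is a constant $C_i > 0$ such that for all 
probability measures $\nu_0, \nu_1$ on $V_i$, there exists $\pi = \pi^i \in \Pi(\nu_0, \nu_1)$ such that it holds 

\[ H(\nu_t^\pi|\mu^i) \le (1 - t)H(\nu_0|\mu^i) + tH(\nu_1|\mu^i) - C_i\, t(1 - t)\big(I_2(\pi) + \bar I_2(\pi)\big), \qquad \forall t \in [0, 1]. \]

Then the product probability measure $\mu = \mu^1 \otimes \cdots \otimes \mu^n$ defined on the Cartesian product $G = 
G_1 \,\square\, \cdots \,\square\, G_n$ (see below for a precise definition) verifies the following property: for all probability 
measures $\nu_0, \nu_1$ on $V$, there exists $\pi = \pi^{(n)} \in \Pi(\nu_0, \nu_1)$ satisfying 

\[ H(\nu_t^\pi|\mu) \le (1 - t)H(\nu_0|\mu) + tH(\nu_1|\mu) - C\, t(1 - t)\big(I_2^{(n)}(\pi) + \bar I_2^{(n)}(\pi)\big), \qquad \forall t \in [0, 1], \]

where $C = \min_i C_i$, 

\[ I_2^{(n)}(\pi) := \sum_{x\in V_1\times\cdots\times V_n}\ \sum_{i=1}^n \Big(\sum_{y\in V_1\times\cdots\times V_n} d_i(x_i,y_i)\, \frac{\pi(x,y)}{\nu_0(x)}\Big)^2 \nu_0(x), \]

and 

\[ \bar I_2^{(n)}(\pi) := \sum_{y\in V_1\times\cdots\times V_n}\ \sum_{i=1}^n \Big(\sum_{x\in V_1\times\cdots\times V_n} d_i(x_i,y_i)\, \frac{\pi(x,y)}{\nu_1(y)}\Big)^2 \nu_1(y) \]

(and with $I_2(\pi) := I_2^{(1)}(\pi)$ and similarly for $\bar I_2(\pi)$). 
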
In particular, as a consequence of the above tensorisation theorem, we shall prove that, given two 
probability measures $\nu_0, \nu_1$ on the hypercube $\Omega_n = \{0, 1\}^n$, there exists a coupling $\pi$ such that 

\[ H(\nu_t^\pi|\mu) \le (1 - t)H(\nu_0|\mu) + tH(\nu_1|\mu) - \frac{t(1 - t)}{2}\, \widetilde W_2^2(\nu_0,\nu_1), \qquad \forall t \in [0, 1], \tag{1.6} \]

where $\mu \equiv 1/2^n$ is the uniform measure (but that could be any product of Bernoulli measures). 
As it is easy to see, the weak transport cost is weaker than $W_2$, but stronger than $W_1$. Moreover, 
$\widetilde W_2^2(\nu_0, \nu_1) \ge \frac{2}{n} W_1^2(\nu_0, \nu_1)$ (see below) so that (1.6) captures, in a sense, a discrete Ricci curvature of 
the hypercube (see [38] and references therein). 

As a by-product of the displacement convexity property above, we shall derive a series of conse- 
quences. More precisely, we shall first derive a so-called HWI inequality. 

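Proposition 1.7. Let $\mu$ be a probability measure on $V^n$. Assume that $\mu$ verifies the following 
displacement convexity inequality: there is some $c > 0$ such that for any probability measures $\nu_0, \nu_1$ on 
$V^n$, there exists a coupling $\pi \in \Pi(\nu_0, \nu_1)$ such that 

\[ H(\nu_t^\pi|\mu) \le (1 - t)H(\nu_0|\mu) + tH(\nu_1|\mu) - c\,t(1 - t)\big(I_2^{(n)}(\pi) + \bar I_2^{(n)}(\pi)\big), \qquad \forall t \in [0, 1]. \]

Then $\mu$ verifies 

\[ H(\nu_0|\mu) \le H(\nu_1|\mu) + \sqrt{\sum_{x\in V^n} \sum_{i=1}^n \Big[\sum_{z\in N_i(x)} \Big(\log\frac{\nu_0(x)}{\mu(x)} - \log\frac{\nu_0(z)}{\mu(z)}\Big)_+\Big]^2 \nu_0(x)}\ \sqrt{I_2^{(n)}(\pi)} - c\big(I_2^{(n)}(\pi) + \bar I_2^{(n)}(\pi)\big) \]

for the same $\pi \in \Pi(\nu_0, \nu_1)$ as above, where $N_i(x)$ is the set of neighbors of $x$ in the $i$-th direction (see 
Proposition 5.1 for a precise definition). 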

On the hypercube, the latter implies the following log-Sobolev-type inequality (that can be seen as 
a reinforcement of a discrete modified log-Sobolev inequality, see Corollary 5.3): if $\mu \equiv 1/2^n$, for 
any $f \colon \Omega_n \to (0, \infty)$, it holds 

\[ \mathrm{Ent}_\mu(f) \le \frac{1}{2} \sum_{x\in\Omega_n} \sum_{i=1}^n \big[\log f(x) - \log f(\sigma_i(x))\big]_+^2\, f(x)\mu(x) - \frac{1}{2}\,\widetilde W_2^2(f\mu|\mu), \]

where $\sigma_i(x) = (x_1, \dots, x_{i-1}, 1 - x_i, x_{i+1}, \dots, x_n)$ is the vector $x = (x_1, \dots, x_n)$ with the $i$-th coordinate 
flipped, and the constant $1/2$ (in front of the Dirichlet form) is optimal. 

From this, by means of the Central Limit Theorem, the above reinforced modified log-Sobolev 
inequality actually leads to the usual logarithmic Sobolev inequality of Gross [16] for the standard 
Gaussian, with the optimal constant (see Corollary 5.5). 

In a different direction, we also prove that the displacement convexity along the interpolating path 
$\{\nu_t^\pi\}_{t\in[0,1]}$ implies a discrete Prékopa-Leindler inequality (Theorem 6.4), which in turn, as in the 
continuous setting, implies a logarithmic Sobolev inequality and a (weak) transport-entropy inequality 
of the Talagrand-type: 

\[ \widetilde W_2^2(\nu|\mu) \le C\, H(\nu|\mu), \qquad \forall \nu, \]

for a suitable constant $C > 0$. These implications and inequalities, together with their 
various links with the concentration of measure phenomenon and with other functional inequalities, 
are studied in further detail in the companion paper [15]. 

We may summarize the various implications that we prove in the following diagram: 

    Displacement convexity  =>  Prékopa-Leindler  =>  Modified log-Sob and Weak transport 
    Displacement convexity  =>  HWI  =>  Modified log-Sob  =>  log-Sob for the Gaussian 


In summary, our paper develops various theoretical objects of much current interest (the 
interpolating path $\{\nu_t^\pi\}_{t\in[0,1]}$, the weak transport cost $\widetilde W_2$, the displacement convexity property and its 
consequences) in a discrete context. Our concrete examples include the complete graph and the hypercube. 
However, our theory applies to other graphs (not necessarily of product type) that we will collect in a 
forthcoming paper. Also, we believe that our results open a wide class of new problems and new 
directions of investigation in Probability Theory, Convex Geometry and Analysis. 

Finally, we mention that, during the final preparation of this work, we learned that Erwan Hillion 
independently introduced the same kind of interpolating path, but between a Dirac at a fixed point 
o e G of the graph and any arbitrary measure (hence without coupling n), and derive some displace- 
ment convexity property Ifl8l along the interpolation. In IfTHl . the author also deals with the / • g 
decomposition introduced by Leonard E71 . 

Our presentation follows the following table of contents. 
1.1. Notation. Throughout the paper we shall use the following notation. 

Graphs. $G = (V,E)$ will denote a finite connected undirected graph with the vertex set $V$ and the 
edge set $E$. For any two vertices $x$ and $y$ of $G$, $x \sim y$ means that $x$ and $y$ are nearest neighbors (for the 
graph structure of $G$), i.e. $(x,y) \in E$. We use $d$ for the graph distance defined below. 

Given two graphs $G_1 = (V_1,E_1)$, $G_2 = (V_2,E_2)$, with graph distances $d_1$, $d_2$ respectively, we set 
$G_1 \,\square\, G_2 = (V_1 \times V_2, E_1 \,\square\, E_2)$ for the Cartesian product of the two graphs, equipped with the $\ell_1$ 
distance $d(x,y) = d_1(x_1, y_1) + d_2(x_2, y_2)$, for all $x = (x_1, x_2)$, $y = (y_1,y_2) \in V_1 \times V_2$. More precisely, 
$((x_1,x_2), (y_1,y_2)) \in E_1 \,\square\, E_2$ if either $x_1 = y_1$ and $x_2 \sim y_2$, or $x_1 \sim y_1$ and $x_2 = y_2$. The Cartesian 
product of $G$ with itself will simply be denoted by $G^2$, and more generally by $G^n$, for all $n \ge 2$. 
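
The claim that the graph distance of the Cartesian product is the $\ell_1$ sum of the factor distances can be checked on small examples. The sketch below is an illustration only; the helper names `bfs_distances`, `cartesian_product` and `check_l1_distance`, and the adjacency-dictionary encoding of a graph, are assumptions and do not come from the paper.

```python
from collections import deque
from itertools import product


def bfs_distances(adj, source):
    """Graph distances from source, computed by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def cartesian_product(adj1, adj2):
    """Adjacency of the Cartesian product: move in exactly one coordinate."""
    adj = {}
    for x1, x2 in product(adj1, adj2):
        adj[(x1, x2)] = ([(y1, x2) for y1 in adj1[x1]]
                         + [(x1, y2) for y2 in adj2[x2]])
    return adj


def check_l1_distance(adj1, adj2):
    """True if d((x1,x2),(y1,y2)) = d1(x1,y1) + d2(x2,y2) for all pairs."""
    prod = cartesian_product(adj1, adj2)
    d1 = {x: bfs_distances(adj1, x) for x in adj1}
    d2 = {x: bfs_distances(adj2, x) for x in adj2}
    for x in prod:
        dx = bfs_distances(prod, x)
        for y in prod:
            if dx[y] != d1[x[0]][y[0]] + d2[x[1]][y[1]]:
                return False
    return True


# A path on three vertices and a cycle on four vertices.
path3 = {0: [1], 1: [0, 2], 2: [1]}
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
ok = check_l1_distance(path3, cycle4)
```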

Paths and geodesics. A path $\gamma = (x_0, x_1, \dots, x_n)$ (of $G$) is an oriented sequence of vertices of $G$ 
satisfying $x_{i-1} \sim x_i$ for any $i = 1, \dots, n$. Such a path starts at $x_0$ and ends at $x_n$ and is said to be 
of length $|\gamma| = n$. The graph distance $d(x, y)$ between two vertices $x, y \in G$ is the minimal length 
of a path connecting $x$ to $y$. Any path of length $n = d(x, y)$ between $x$ and $y$ is called a geodesic 
between $x$ and $y$. By construction, any geodesic is self-avoiding. We will denote by $\Gamma(x, y)$ the set of 
all geodesics from $x$ to $y$. 
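
Since the number of geodesics $|\Gamma(x,y)|$ is used repeatedly in what follows, here is a small sketch of how it can be computed by breadth-first search. This is an illustration under assumptions: the function names `bfs_distances`, `count_geodesics` and `hypercube`, and the encoding of hypercube vertices as integers (bit masks), are not from the paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Graph distances d(source, .) by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def count_geodesics(adj, x, y):
    """Return (d(x, y), |Gamma(x, y)|) for a connected graph."""
    dist = bfs_distances(adj, x)
    paths = {x: 1}  # paths[v] = number of shortest paths from x to v
    for v in sorted(dist, key=dist.get):
        if v != x:
            # a geodesic to v extends a geodesic to a neighbor one step closer
            paths[v] = sum(paths[u] for u in adj[v] if dist[u] == dist[v] - 1)
    return dist[y], paths[y]


def hypercube(n):
    """Hypercube {0,1}^n with vertices encoded as integers 0 .. 2^n - 1."""
    return {v: [v ^ (1 << i) for i in range(n)] for v in range(2 ** n)}


distance, number = count_geodesics(hypercube(3), 0, 7)
```

On the three-dimensional hypercube, opposite corners are at distance 3 and are joined by $3! = 6$ geodesics, one for each order in which the coordinates are flipped.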

We will say that a path $\gamma = (x_0,x_1, \dots,x_n)$ crosses the vertex $z \in V$ if there is some $0 \le k \le n$ such 
that $z = x_k$. In this case, we will write $z \in \gamma$. Given $z \in V$, we set $C(z) = \{(x,y) : z \in 
\gamma \text{ for some } \gamma \in \Gamma(x,y)\}$ for the set of couples such that some geodesic joining them goes through $z$. 
Conversely, if $z$ belongs to some geodesic between $x$ and $y$, we shall write $z \in [\![x, y]\!]$ and say that 
$z$ is between $x$ and $y$. Finally, for all $x,y,z \in V$, we will denote by $\Gamma(x, z,y)$ the set of geodesics 
$\gamma \in \Gamma(x,y)$ such that $z \in \gamma$. This set is nonempty if and only if $z \in [\![x,y]\!]$. 

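Probability measures and couplings. We write $\mathcal{P}(V)$ for the set of probability measures on $V$. 
Given a probability measure $\nu \in \mathcal{P}(V)$ and a function $f\colon V \to \mathbb{R}$, $\nu(f) = \sum_{z\in V} \nu(z)f(z)$ denotes the 
mean value of $f$ with respect to $\nu$. We may also use the alternative notation 

\[ \nu(f) = \int f(x)\, \nu(dx) = \int f(x)\,d\nu(x) = \int f\,d\nu. \]

Let $\nu,\mu \in \mathcal{P}(V)$; the relative entropy of $\nu$ with respect to $\mu$ is defined by 

\[ H(\nu|\mu) := \int \log\frac{d\nu}{d\mu}\, d\nu \quad \text{if } \nu \ll \mu, \qquad H(\nu|\mu) := +\infty \quad \text{otherwise}, \]

where $\nu \ll \mu$ means that $\nu$ is absolutely continuous with respect to $\mu$, and $\frac{d\nu}{d\mu}$ denotes the density of $\nu$ 
with respect to $\mu$. 
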
Given a density $f\colon V \to (0, \infty)$ with respect to a given probability measure $\mu$ (i.e. $\mu(f) = 1$), we 
shall use the following notation for the relative entropy of $f\mu$ with respect to $\mu$: 

\[ \mathrm{Ent}_\mu(f) := H(f\mu|\mu) = \int f\log f \,d\mu. \]

If $f\colon V \to (0, \infty)$ is no longer a density, then $\mathrm{Ent}_\mu(f) := \int f\log(f/\mu(f))\, d\mu$. 

Given two graphs $G_1 = (V_1,E_1)$ and $G_2 = (V_2,E_2)$ and a probability measure $\mu \in \mathcal{P}(V_1 \times V_2)$ 
on the product, we disintegrate $\mu$ as follows: let $\mu_2$ be the second marginal of $\mu$, i.e. $\mu_2(x_2) = 
\sum_{x_1\in V_1} \mu(x_1, x_2) = \mu(V_1, x_2)$, for all $x_2 \in V_2$, and set $\mu_1(x_1|x_2)$ so that 

\[ \mu(x_1,x_2) = \mu_2(x_2)\,\mu_1(x_1|x_2), \qquad \forall (x_1,x_2) \in V_1 \times V_2, \tag{1.8} \]

with the convention that $\mu_1(x_1|x_2) = 0$ if $\mu_2(x_2) = 0$. Equation (1.8) will be referred to as the 
disintegration formula of $\mu$. 

Recall that a coupling $\pi$ of two probability measures $\mu$ and $\nu$ in $\mathcal{P}(V)$ is a probability measure on $V^2$ 
so that $\mu$ and $\nu$ are its first and second marginals, respectively: i.e. $\pi(x, V) = \mu(x)$ and $\pi(V,y) = \nu(y)$, 
for all $x,y \in V$. Given $\mu, \nu \in \mathcal{P}(V)$, the set of all couplings of $\mu$ and $\nu$ will be denoted by $\Pi(\mu, \nu)$. 

Moreover, given two probability measures $\mu$ and $\nu$ in $\mathcal{P}(V)$, we denote by $P(\mu, \nu)$ the set of 
probability kernels¹ $p$ such that 

\[ \sum_{x\in V} \mu(x)\,p(x, y) = \nu(y), \qquad \forall y\in V. \]

By construction, given $p \in P(\mu, \nu)$, one defines a coupling $\pi \in \Pi(\mu, \nu)$ by setting $\pi(x,y) = \mu(x)p(x,y)$, 
$x,y \in V$. Conversely, given a coupling $\pi \in \Pi(\mu, \nu)$, we canonically construct a kernel $p \in P(\mu, \nu)$ by 
setting $p(x, y) = \pi(x, y)/\mu(x)$ when $\mu(x) \ne 0$ and $p(x, y) = 0$ otherwise. 

Warning 1: In the sequel, it will always be understood, although not explicitly stated, that $p(x,y) = 0$ 
if $\mu(x) = 0$, and similarly in the disintegration formula (1.8). 

Warning 2: For convenience, we will use the French notation $C_n^k := \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ for the binomial 
coefficients. 



2. A NOTION OF A PATH ON THE SET OF PROBABILITY MEASURES ON GRAPHS. 

The aim of this section is to define a class of paths between probability measures on graphs. As 
proved below, each path in this class is a geodesic, in the space of probability measures equipped with 
the Wasserstein distance W\ (see below). It satisfies a convenient differentiation property and also 
has the nice feature of allowing tensorisation. We shall end the section with some specific examples. 



2.1. Construction. Inspired by [20], we will first construct an interpolating path between two Dirac 
measures $\delta_x$ and $\delta_y$, for arbitrary $x,y \in V$, on the set of probability measures $\mathcal{P}(V)$. Fix $x, y \in V$ and 
denote by $\Gamma$ the random variable that chooses uniformly at random a geodesic $\gamma$ in $\Gamma(x,y)$. Also, for 
any $t \in [0,1]$, let $N_t \sim \mathcal{B}(d(x,y), t)$ be a binomial variable of parameters $d(x,y)$ and $t$, independent of 
$\Gamma$ (observe that $N_0 = 0$ and $N_1 = d(x,y)$). Then denote by $X_t = \Gamma_{N_t}$ the random position on $\Gamma$ after $N_t$ 
jumps starting from $x$. Finally, set $\nu_t^{x,y}$ for the law of $X_t$. 

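By construction, $\nu_t^{x,y}$ is clearly a path from $\delta_x$ to $\delta_y$. Moreover, for all $z \in V$, we have 

\[ \nu_t^{x,y}(z) = \sum_{\gamma\in\Gamma(x,y)} \mathbb{P}(X_t = z \,|\, \Gamma = \gamma,\ z \in \gamma)\, \mathbb{P}(\Gamma = \gamma,\ z \in \gamma) = \sum_{\gamma\in\Gamma(x,y):\, z\in\gamma} C_{d(x,y)}^{d(x,z)}\, t^{d(x,z)} (1 - t)^{d(y,z)}\, \frac{1}{|\Gamma(x,y)|}. \]

Therefore 

\[ \nu_t^{x,y}(z) = C_{d(x,y)}^{d(x,z)}\, t^{d(x,z)} (1 - t)^{d(y,z)}\, \frac{|\Gamma(x,z,y)|}{|\Gamma(x,y)|}. \]

¹ We recall that $p \colon V \times V \to [0, 1]$ is a probability kernel if, for all $x \in V$, $\sum_{y\in V} p(x,y) = 1$. 
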
For all $z$ between $x$ and $y$ we observe that 

\[ |\Gamma(x,z,y)| = |\Gamma(x,z)| \times |\Gamma(z,y)|, \tag{2.1} \]

since there is a one to one correspondence between pairs of geodesics, one from $x$ to $z$ and one from $z$ to $y$, 
and the set of geodesics from $x$ to $y$ that cross the vertex $z$, just by gluing the path from $x$ to $z$ to the 
path from $z$ to $y$, and by using that $d(x,y) = d(x,z) + d(z,y)$. Therefore $\nu_t^{x,y}$ takes the form 

\[ \nu_t^{x,y}(z) = C_{d(x,y)}^{d(x,z)}\, t^{d(x,z)} (1 - t)^{d(z,y)}\, \frac{|\Gamma(x,z)| \times |\Gamma(z,y)|}{|\Gamma(x,y)|}\, \mathbf{1}_{z \in [\![x,y]\!]}. \tag{2.2} \]

Observe that, for any $x, y \in V$ and any $t \in (0, 1)$, $\nu_t^{x,y} = \nu_{1-t}^{y,x}$. 
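
Formula (2.2) is explicit enough to be evaluated directly. The following sketch is illustrative only: the names `geodesic_data` and `interpolate`, and the representation of a graph as an adjacency dictionary, are assumptions rather than anything defined in the paper. It computes the law $\nu_t^{x,y}$ from distances and geodesic counts obtained by breadth-first search.

```python
from collections import deque
from math import comb


def geodesic_data(adj, source):
    """BFS from source: distances and numbers of shortest paths."""
    dist, count = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                count[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                count[w] += count[u]
    return dist, count


def interpolate(adj, x, y, t):
    """Law nu_t^{x,y} of formula (2.2), as a dict vertex -> mass."""
    dist_x, count_x = geodesic_data(adj, x)
    dist_y, count_y = geodesic_data(adj, y)  # undirected: |Gamma(z,y)| = count_y[z]
    d = dist_x[y]
    law = {}
    for z in adj:
        if dist_x[z] + dist_y[z] == d:  # z lies between x and y
            k = dist_x[z]
            law[z] = (comb(d, k) * t ** k * (1 - t) ** (d - k)
                      * count_x[z] * count_y[z] / count_x[y])
    return law


# Cycle on four vertices, opposite corners 0 and 2, midpoint in time.
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
law = interpolate(cycle4, 0, 2, 0.5)
```

On the four-cycle with $x$ and $y$ opposite and $t = 1/2$, every vertex receives mass $1/4$. The same function can be used to check numerically that the masses sum to one and that $\nu_t^{x,y} = \nu_{1-t}^{y,x}$.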

Remark 2.3. In the construction above of the interpolation $\nu_t^{x,y}$, the choice of the binomial random 
variable for the number $N_t$ of jumps might seem somewhat ad hoc; however, in Proposition 2.12 
below, we show that in fact this choice is necessary for $\nu_t^{x,y}$ to tensorise over a (Cartesian) product of 
graphs. 

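Given the family $\{\nu_t^{x,y}\}_{x,y}$, we can now construct a path from any measure $\nu_0 \in \mathcal{P}(V)$ to any measure 
$\nu_1 \in \mathcal{P}(V)$. Namely, given a coupling $\pi \in \mathcal{P}(V \times V)$ of $\nu_0$ and $\nu_1$, we define 

\[ \nu_t^\pi(\cdot) = \sum_{(x,y)\in V^2} \pi(x,y)\, \nu_t^{x,y}(\cdot), \qquad \forall t \in [0,1]. \tag{2.4} \]

By construction we have $\nu_0^\pi = \nu_0$ and $\nu_1^\pi = \nu_1$. Furthermore, observe that, if $\nu_0 = \delta_x$ and $\nu_1 = \delta_y$, 
then necessarily $\pi = \delta_x \otimes \delta_y$ and thus $\nu_t^\pi = \nu_t^{x,y}$. 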

2.2. Geodesics for $W_1$. Next we prove that, when $\pi$ is well chosen, $(\nu_t^\pi)_{t\in[0,1]}$ is a geodesic from $\nu_0$ 
to $\nu_1$ on the set of probability measures $\mathcal{P}(V)$ equipped with the Wasserstein $L_1$-distance $W_1$. 
Given two probability measures $\mu$ and $\nu$ in $\mathcal{P}(V)$, recall that 

\[ W_1(\mu,\nu) = \inf_{\pi\in\Pi(\mu,\nu)} \iint d(x,y)\, \pi(dx\,dy) = \inf_{X\sim\mu,\ Y\sim\nu} \mathbb{E}[d(X,Y)]. \]

The following result asserts that $(\nu_t^\pi)_{t\in[0,1]}$ is actually a geodesic for $W_1$ when $\pi$ is an optimal 
coupling. 

Proposition 2.5. For any probability measures $\nu_0, \nu_1 \in \mathcal{P}(V)$, it holds 

\[ W_1(\nu_t^{\pi^*}, \nu_s^{\pi^*}) = |t - s|\, W_1(\nu_0,\nu_1), \qquad \forall s,t \in [0, 1], \]

where $\pi^*$ is an optimal coupling in the definition of $W_1(\nu_0, \nu_1)$ and where $\nu_t^{\pi^*}$ is defined in (2.4). 

Proof. Fix two probability measures $\nu_0, \nu_1 \in \mathcal{P}(V)$ and $\pi^*$ an optimal coupling in the definition of 
$W_1(\nu_0, \nu_1)$ (since $\mathcal{P}(V)$ is compact, $\pi^*$ is well defined). For brevity, set $\nu_t := \nu_t^{\pi^*}$. 
First, we claim that it is enough to prove that 

\[ W_1(\nu_s, \nu_t) \le (t - s)\,W_1(\nu_0, \nu_1), \qquad \forall s, t \in [0, 1] \text{ with } s < t. \tag{2.6} \]

Indeed, assume (2.6); then, recalling that $W_1$ is a distance (see e.g. [55]), by the triangle inequality we 
have 

\[ W_1(\nu_0,\nu_1) \le W_1(\nu_0,\nu_s) + W_1(\nu_s,\nu_t) + W_1(\nu_t,\nu_1) \le s\,W_1(\nu_0,\nu_1) + (t - s)\,W_1(\nu_0, \nu_1) + (1-t)\,W_1(\nu_0,\nu_1) = W_1(\nu_0,\nu_1). \]

Hence, all the inequalities used above are actually equalities, which guarantees the conclusion of the 
proposition and hence the claim. 

Now, we prove (2.6). Let $(X, Y)$ be a random couple of law $\pi^*$. Fix $s < t$; it suffices to construct a 
random couple $(X_s,X_t)$ with marginal laws $\nu_s$ and $\nu_t$ so that 

\[ \mathbb{E}[d(X_s,X_t)] \le (t - s)\,\mathbb{E}[d(X, Y)] = (t - s)\,W_1(\nu_0,\nu_1). \]

From the last observation, let us remark that such a couple $(X_s, X_t)$ will therefore realize 

\[ \mathbb{E}[d(X_s,X_t)] = W_1(\nu_s,\nu_t). \]

Let $((U_s^i, V_t^i))_{i\ge 1}$ be an independent identically distributed sequence of random couples in $\{0, 1\}^2$, 
independent of $X$ and $Y$. We choose the law of $(U_s^1, V_t^1)$ given by 

\[ \mathbb{P}((U_s^1, V_t^1) = (0, 0)) = 1 - t, \qquad \mathbb{P}((U_s^1, V_t^1) = (1, 0)) = 0, \]
\[ \mathbb{P}((U_s^1, V_t^1) = (0, 1)) = t - s, \qquad \mathbb{P}((U_s^1, V_t^1) = (1, 1)) = s, \]

so that $U_s^1$ and $V_t^1$ are Bernoulli random variables with respective parameters $s$ and $t$, and we have 

\[ \mathbb{E}(|U_s^1 - V_t^1|) = t - s. \]

Given $(X, Y) = (x,y)$, with $x,y \in V$, let $(N_s,N_t)$ denote the random couple defined by 

\[ N_s = \sum_{i=1}^{d(x,y)} U_s^i, \qquad N_t = \sum_{i=1}^{d(x,y)} V_t^i. \]

Then the laws of $N_s$ and $N_t$ given $(X, Y) = (x,y)$ are respectively $\mathcal{B}(d(x, y), s)$ and $\mathcal{B}(d(x, y), t)$, the 
binomial distributions with parameters $d(x,y)$, $s$ and $d(x,y)$, $t$ respectively. 

Finally, given $(X, Y) = (x,y)$, with $x,y \in V$, let $\Gamma$ denote a random geodesic chosen uniformly in 
$\Gamma(x,y)$, independently of the sequence $((U_s^i, V_t^i))_{i\ge 1}$, and let $X_s = \Gamma_{N_s}$ be the random position on $\Gamma$ 
after $N_s$ jumps and $X_t = \Gamma_{N_t}$ be the random position on $\Gamma$ after $N_t$ jumps. By definition, the laws of 
$X_s$ and $X_t$ are respectively $\nu_s$ and $\nu_t$ and one has $d(X_s,X_t) = |N_s - N_t|$. Moreover, according to this 
construction, one has 

\[ \mathbb{E}[d(X_s,X_t)] = \mathbb{E}[|N_s - N_t|] = \mathbb{E}\Big[\Big|\sum_{i=1}^{d(X,Y)} (U_s^i - V_t^i)\Big|\Big] \le \mathbb{E}\Big[\sum_{i=1}^{d(X,Y)} |U_s^i - V_t^i|\Big] = (t-s)\,\mathbb{E}[d(X, Y)]. \]

This completes the proof of (2.6) and of Proposition 2.5. □ 
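
The coupling used in this proof can be checked exactly for small parameters. The sketch below is an illustration under a stated assumption: each pair $(U, V)$ is coupled so that $U \le V$ almost surely, i.e. mass $s$ on $(1,1)$, $t-s$ on $(0,1)$ and $1-t$ on $(0,0)$, which is the coupling for which $U$ and $V$ have Bernoulli laws of parameters $s$ and $t$. The function names are hypothetical.

```python
def bernoulli_coupling(s, t):
    """Joint law of (U, V) with U ~ Bernoulli(s), V ~ Bernoulli(t), U <= V.

    Requires 0 <= s <= t <= 1.
    """
    return {(0, 0): 1 - t, (0, 1): t - s, (1, 0): 0.0, (1, 1): s}


def joint_law_of_counts(d, s, t):
    """Exact joint law of (N_s, N_t), sums of d i.i.d. coupled pairs."""
    law = {(0, 0): 1.0}
    step = bernoulli_coupling(s, t)
    for _ in range(d):
        new_law = {}
        for (a, b), p in law.items():
            for (u, v), q in step.items():
                key = (a + u, b + v)
                new_law[key] = new_law.get(key, 0.0) + p * q
        law = new_law
    return law


def expected_gap(d, s, t):
    """E|N_s - N_t| under the coupling above."""
    return sum(p * abs(a - b)
               for (a, b), p in joint_law_of_counts(d, s, t).items())


gap = expected_gap(5, 0.2, 0.7)  # compare with (t - s) * d = 2.5
```

With $d = 5$, $s = 0.2$ and $t = 0.7$ the expected gap agrees with $(t-s)\,d = 2.5$ up to floating-point rounding, in line with the last display of the proof.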

2.3. Differentiation property. A second property of the path defined in (2.2) and (2.4) is the 
following time differentiation property. 

For any $z$ on a given geodesic $\gamma$ from $x$ to $y$, if $z \ne y$, let $\gamma_+(z)$ denote the (unique) vertex on $\gamma$ at 
distance $d(z,y) - 1$ from $y$ (and thus at distance $d(x,z) + 1$ from $x$), and similarly if $z \ne x$, let $\gamma_-(z)$ 
denote the vertex on $\gamma$ at distance $d(z, y) + 1$ from $y$ (and hence at distance $d(x, z) - 1$ from $x$). In other 
words, following the geodesic $\gamma$ from $x$ toward $y$, $\gamma_-(z)$ is the vertex just anterior to $z$, and $\gamma_+(z)$ the 
vertex posterior to $z$. 

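For any real function $f$ on $V$, we also define two related notions of gradient along $\gamma$: for all $z \in \gamma$, $z \ne y$, 

\[ \nabla_\gamma^+ f(z) = f(\gamma_+(z)) - f(z), \]

and for all $z \in \gamma$, $z \ne x$, 

\[ \nabla_\gamma^- f(z) = f(z) - f(\gamma_-(z)). \]

By convention, we put $\nabla_\gamma^- f(x) = \nabla_\gamma^+ f(y) = 0$, and $\nabla_\gamma^+ f(z) = \nabla_\gamma^- f(z) = 0$ if $z \notin \gamma$. Let $\nabla_\gamma f$ denote the 
following convex combination of these two gradients: 

\[ \nabla_\gamma f(z) = \frac{d(z,y)}{d(x,y)}\, \nabla_\gamma^+ f(z) + \frac{d(x,z)}{d(x,y)}\, \nabla_\gamma^- f(z). \]

Observe that, although not explicitly stated, $\nabla_\gamma$ depends on $x$ and $y$. Finally, for all $z \in [\![x,y]\!]$, we 
define 

\[ \nabla_{x,y} f(z) = \frac{1}{|\Gamma(x,z,y)|} \sum_{\gamma\in\Gamma(x,z,y)} \nabla_\gamma f(z), \]

and when $z \notin [\![x,y]\!]$, we set $\nabla_{x,y} f(z) = 0$. 

Proposition 2.7. For all functions $f\colon V \to \mathbb{R}$ and all $x,y \in V$, it holds 

\[ \frac{d}{dt}\, \nu_t^{x,y}(f) = d(x,y)\, \nu_t^{x,y}(\nabla_{x,y} f). \]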
As a direct consequence of the above differentiation property, we are able to give an explicit 
expression of the derivative (with respect to time) of the relative entropy of $\nu_t^\pi$ with respect to an arbitrary 
reference measure. 

Corollary 2.8. Let $\nu_0, \nu_1$ and $\mu$ be three probability measures on $V$. Assume that $\nu_0, \nu_1$ are absolutely 
continuous with respect to $\mu$. Then, for any coupling $\pi \in \Pi(\nu_0, \nu_1)$, it holds 

\[ \frac{d}{dt} H(\nu_t^\pi|\mu)\Big|_{t=0} = \sum_{x\in V} \sum_{z\sim x} \Big(\log\frac{\nu_0(z)}{\mu(z)} - \log\frac{\nu_0(x)}{\mu(x)}\Big) \sum_{y\in V} d(x,y)\, \frac{|\Gamma(x,z,y)|}{|\Gamma(x,y)|}\, \pi(x,y). \]

The proof of Corollary 2.8 can be found below, while some example applications will be given in 
the next subsection. In order to prove Proposition 2.7 we need some preparation. Recall that $\mathcal{B}(n, t)$ 
denotes a binomial variable of parameters $n$ and $t$, and that, for any function $h\colon \{0,1,\dots, n\} \to \mathbb{R}$, 
$\mathcal{B}(n,t)(h) = \sum_{k=0}^n h(k)\, C_n^k\, t^k (1 - t)^{n-k}$. 



Lemma 2.9. Let $n \in \mathbb{N}^*$ and $t \in [0, 1]$. For any function $h\colon \{0,1,\dots,n\} \to \mathbb{R}$ it holds 

\[ \frac{d}{dt}\, \mathcal{B}(n, t)(h) = \sum_{k=0}^n \big[(h(k + 1) - h(k))(n - k) + (h(k) - h(k - 1))k\big]\, C_n^k\, t^k (1 - t)^{n-k}, \]

with the convention that $h(-1) = h(n + 1) = 0$. 
Proof of Lemma 2.9. By differentiating in $t$, we have 

\[ \frac{d}{dt}\, \mathcal{B}(n, t)(h) = \sum_{k=0}^n h(k)\,k\,C_n^k\, t^{k-1}(1 - t)^{n-k} - \sum_{k=0}^n h(k)(n - k)\,C_n^k\, t^k(1 - t)^{n-k-1}. \]

Now, using that $1 = t + (1 - t)$ and that $k\,C_n^k = (n - k + 1)\,C_n^{k-1}$, we get 

\[ k\,C_n^k\, t^{k-1}(1 - t)^{n-k} = k\,C_n^k\, t^k (1 - t)^{n-k} + (n-k+ 1)\,C_n^{k-1}\, t^{k-1}(1 - t)^{n-k+1}, \]

with the convention that $C_n^{-1} = 0$. Similarly, using that $(n - k)\,C_n^k = (k + 1)\,C_n^{k+1}$, we have 

\[ (n - k)\,C_n^k\, t^k(1 - t)^{n-k-1} = (n - k)\,C_n^k\, t^k(1 - t)^{n-k} + (k + 1)\,C_n^{k+1}\, t^{k+1}(1 - t)^{n-k-1}, \]

with the convention that $C_n^{n+1} = 0$. Hence, 

\[ \frac{d}{dt}\, \mathcal{B}(n, t)(h) = \sum_{k=0}^n h(k)(n - k + 1)\,C_n^{k-1}\, t^{k-1}(1 - t)^{n-k+1} - \sum_{k=0}^n h(k)(n - k)\,C_n^k\, t^k(1 - t)^{n-k} \]
\[ \qquad + \sum_{k=0}^n h(k)\,k\,C_n^k\, t^k(1 - t)^{n-k} - \sum_{k=0}^n h(k)(k + 1)\,C_n^{k+1}\, t^{k+1}(1 - t)^{n-k-1} \]
\[ = \sum_{k=0}^n (h(k + 1) - h(k))(n - k)\,C_n^k\, t^k(1 - t)^{n-k} + \sum_{k=0}^n (h(k) - h(k - 1))\,k\,C_n^k\, t^k(1 - t)^{n-k}, \]

with the convention that $h(-1) = h(n + 1) = 0$. □ 
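
As a sanity check, the identity of Lemma 2.9 can be compared with a numerical derivative. The following sketch is only illustrative; the function names `binomial_mean`, `lemma_rhs` and `numerical_derivative`, the step size and the test functions are assumptions.

```python
from math import comb


def binomial_mean(h, n, t):
    """B(n, t)(h) = sum_k h(k) C(n, k) t^k (1 - t)^(n - k)."""
    return sum(h(k) * comb(n, k) * t ** k * (1 - t) ** (n - k)
               for k in range(n + 1))


def lemma_rhs(h, n, t):
    """Right-hand side of Lemma 2.9, with h(-1) = h(n + 1) = 0."""
    def hh(k):
        return h(k) if 0 <= k <= n else 0.0
    return sum(((hh(k + 1) - hh(k)) * (n - k) + (hh(k) - hh(k - 1)) * k)
               * comb(n, k) * t ** k * (1 - t) ** (n - k)
               for k in range(n + 1))


def numerical_derivative(h, n, t, eps=1e-6):
    """Central finite difference of t -> B(n, t)(h)."""
    return (binomial_mean(h, n, t + eps) - binomial_mean(h, n, t - eps)) / (2 * eps)


# For h(k) = k^2 the mean is n t (1 - t) + n^2 t^2, whose derivative in t
# is n - 2 n t + 2 n^2 t; with n = 6 and t = 0.3 this equals 24.
value = lemma_rhs(lambda k: k * k, 6, 0.3)
```

For $h(k) = k^2$, $n = 6$ and $t = 0.3$ both sides equal 24 up to rounding, and the finite-difference approximation agrees with `lemma_rhs` for other choices of $h$ as well.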

We were informed by E. Hillion that the above elementary lemma also appears in his thesis iTTTl . 
We are now in a position to prove Proposition 2.7. 



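Proof of Proposition 2.7. Set $n = d(x,y)$ and let $\Gamma$ be a random variable uniformly distributed on 
$\Gamma(x,y)$ and $N_t$ be a random variable with binomial law $\mathcal{B}(n, t)$ independent of $\Gamma$. By definition $\nu_t^{x,y}$ is 
the law of $X_t = \Gamma_{N_t}$. Using the independence, we have 

\[ \nu_t^{x,y}(f) = \mathbb{E}[f(X_t)] = \sum_{k=0}^n h(k)\, C_n^k\, t^k(1 - t)^{n-k}, \]

with $h(k) = \mathbb{E}[f(\Gamma_k)]$, $k = 0, 1, \dots, n$. According to Lemma 2.9 we thus get 

\[ \frac{d}{dt}\, \nu_t^{x,y}(f) = \sum_{k=0}^n \big[(h(k+1) - h(k))(n - k) + (h(k) - h(k - 1))k\big]\, C_n^k\, t^k(1 - t)^{n-k} \]
\[ = \mathbb{E}\big[(h(N_t + 1) - h(N_t))(n - N_t) + (h(N_t) - h(N_t - 1))N_t\big] \]
\[ = \mathbb{E}\big[(f(\Gamma_{N_t+1}) - f(\Gamma_{N_t}))\, d(\Gamma_{N_t},y) + (f(\Gamma_{N_t}) - f(\Gamma_{N_t-1}))\, d(x, \Gamma_{N_t})\big] \]
\[ = \mathbb{E}\big[(f(\Gamma_+(X_t)) - f(X_t))\, d(X_t,y) + (f(X_t) - f(\Gamma_-(X_t)))\, d(x,X_t)\big] \]
\[ = \mathbb{E}\big[d(x,y)\, \nabla_\Gamma f(X_t)\big]. \]

Finally, observe that the law of $\Gamma$ knowing $X_t = z \in [\![x,y]\!]$ is uniform on $\Gamma(x,z,y)$. Indeed, 

\[ \mathbb{P}(\Gamma = \gamma,\ X_t = z) = \mathbb{P}(\Gamma = \gamma,\ \gamma_{N_t} = z) = \mathbb{P}(\Gamma = \gamma,\ N_t = d(x,z),\ z\in\gamma) = \frac{\mathbf{1}_{z\in\gamma}}{|\Gamma(x,y)|}\, \mathbb{P}(N_t = d(x,z)). \]

On the other hand, 

\[ \mathbb{P}(X_t = z) = \nu_t^{x,y}(z) = \mathbb{P}(N_t = d(x,z))\, \frac{|\Gamma(x, z, y)|}{|\Gamma(x,y)|}, \]

which proves the claim. By the definition of $\nabla_{x,y} f$, it thus follows that 

\[ \frac{d}{dt}\, \nu_t^{x,y}(f) = d(x,y)\, \nu_t^{x,y}(\nabla_{x,y} f), \]

which completes the proof. □ 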

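Proof of Corollary 2.8. For simplicity, let $F = \log(\nu_0/\mu)$. Observe that, since $\nu_0$ and $\nu_1$ are absolutely 
continuous with respect to $\mu$, so is $\nu_t^\pi$. Now we observe that, since $\sum_{z\in V} \frac{d}{dt}\nu_t^\pi(z) = 0$, by Proposition 
2.7 (recall that $\nu_0^\pi = \nu_0$ and $\nu_0^{x,y} = \delta_x$ by construction), 

\[ \frac{d}{dt} H(\nu_t^\pi|\mu)\Big|_{t=0} = \sum_{z\in V} \frac{d}{dt}\nu_t^\pi(z)\Big|_{t=0} F(z) = \sum_{(x,y)\in V^2} \pi(x,y)\, \frac{\partial}{\partial t}\, \nu_t^{x,y}(F)\Big|_{t=0} = \sum_{(x,y)\in V^2} \pi(x,y)\,d(x,y)\, \nabla_{x,y}F(x). \]

By the definition of the gradient, for any $\gamma \in \Gamma(x,y)$, it holds $\nabla_\gamma F(x) = \nabla_\gamma^+ F(x)$. Thus, by the 
definition of $\nabla_{x,y}F$, we get 

\[ \frac{d}{dt} H(\nu_t^\pi|\mu)\Big|_{t=0} = \sum_{(x,y)\in V^2} \pi(x, y)\,d(x, y)\, \frac{1}{|\Gamma(x,y)|} \sum_{\gamma\in\Gamma(x,y)} \nabla_\gamma^+ F(x). \]

Now, observe that for $(x, y) \in V^2$ given, it holds 

\[ \sum_{\gamma\in\Gamma(x,y)} \nabla_\gamma^+ F(x) = \sum_{\gamma\in\Gamma(x,y)} \big(F(\gamma_+(x)) - F(x)\big) = \sum_{z\sim x} (F(z) - F(x))\, |\Gamma(x,z,y)|, \]

completing the proof. □ 
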



2.4. Tensoring property. In this section we prove that the path $(\nu_t^{x,y})_{t\in[0,1]}$ constructed in Section 2.1 does tensorise. This will appear to be crucial in deriving the displacement convexity of the entropy on product spaces. Moreover we shall prove that, in order to have this tensoring property, the law of the random variable $N_t$ introduced in the construction of the path $(\nu_t^{x,y})_{t\in[0,1]}$ must be, modulo a change of time, a binomial (see Proposition 2.12 below). The tensoring property of the path $(\nu_t^{x,y})_{t\in[0,1]}$ is the following.

Lemma 2.10. Let $G_1 = (V_1,E_1)$, $G_2 = (V_2,E_2)$ be two graphs and let $G = G_1 \square G_2$ be their Cartesian product. Then, for any $x = (x_1,x_2)$, $y = (y_1,y_2)$ and $z = (z_1,z_2)$ in $V_1 \times V_2$,

$$\nu_t^{x,y}(z) = \nu_t^{x_1,y_1}(z_1)\,\nu_t^{x_2,y_2}(z_2), \qquad \forall t \in [0,1].$$

Proof. Fix $x = (x_1,x_2)$, $y = (y_1,y_2)$ and $z = (z_1,z_2)$ in $V_1 \times V_2$. Then, we observe that, given two geodesics, one from $x_1$ to $y_1$, and one from $x_2$ to $y_2$, one can construct exactly $C_{d(x,y)}^{d(x_1,y_1)}$ different geodesics from $x$ to $y$ (by choosing the $d(x_1,y_1)$ positions where to change the first coordinate, according to the geodesic joining $x_1$ to $y_1$, and thus changing the second coordinate in the remaining $d(x_2,y_2) = d(x,y) - d(x_1,y_1)$ positions, according to the geodesic joining $x_2$ to $y_2$). This construction exhausts all the geodesics from $x$ to $y$. Hence,

(2.11) $$|\Gamma(x,y)| = C_{d(x,y)}^{d(x_1,y_1)}\,|\Gamma(x_1,y_1)| \times |\Gamma(x_2,y_2)|.$$

Observe also that $z$ belongs to some geodesic from $x$ to $y$ if and only if $z_1$ and $z_2$ belong respectively to some geodesic from $x_1$ to $y_1$, and from $x_2$ to $y_2$. Therefore, by (2.11), it follows that

$$|\Gamma(x,z,y)| = C_{d(x,z)}^{d(x_1,z_1)}\,C_{d(y,z)}^{d(y_1,z_1)}\,|\Gamma(x_1,z_1,y_1)| \times |\Gamma(x_2,z_2,y_2)|.$$

So, it holds that

$$\begin{aligned}
\nu_t^{x,y}(z) &= C_{d(x,y)}^{d(x,z)}\,t^{d(x,z)}(1-t)^{d(y,z)}\,\frac{|\Gamma(x,z,y)|}{|\Gamma(x,y)|}\\
&= \frac{C_{d(x,y)}^{d(x,z)}\,C_{d(x,z)}^{d(x_1,z_1)}\,C_{d(y,z)}^{d(y_1,z_1)}}{C_{d(x,y)}^{d(x_1,y_1)}}\; t^{d(x_1,z_1)}(1-t)^{d(y_1,z_1)}\,\frac{|\Gamma(x_1,z_1,y_1)|}{|\Gamma(x_1,y_1)|}\; t^{d(x_2,z_2)}(1-t)^{d(y_2,z_2)}\,\frac{|\Gamma(x_2,z_2,y_2)|}{|\Gamma(x_2,y_2)|}\\
&= \nu_t^{x_1,y_1}(z_1)\,\nu_t^{x_2,y_2}(z_2),
\end{aligned}$$

where we used that $d(x,z) = d(x_1,z_1) + d(x_2,z_2)$, and similarly for $d(y,z)$, and the fact (that the reader can easily verify) that

$$\frac{C_{d(x,y)}^{d(x,z)}\,C_{d(x,z)}^{d(x_1,z_1)}\,C_{d(y,z)}^{d(y_1,z_1)}}{C_{d(x,y)}^{d(x_1,y_1)}} = C_{d(x_1,y_1)}^{d(x_1,z_1)}\,C_{d(x_2,y_2)}^{d(x_2,z_2)}.$$

□
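The binomial identity left to the reader at the end of this proof can be checked exhaustively for small distances. This is a minimal sketch under the following assumed notation (not from the original): $n_i = d(x_i,y_i)$ and $k_i = d(x_i,z_i)$, so that $d(x,y) = n_1+n_2$, $d(x,z) = k_1+k_2$ and $d(y_i,z_i) = n_i - k_i$. The identity is tested in cross-multiplied form so that only exact integer arithmetic is involved.

```python
from math import comb
from itertools import product

def identity_holds(n1, n2, k1, k2):
    """Check C_n^k * C_k^{k1} * C_{n-k}^{n1-k1} == C_{n1}^{k1} * C_{n2}^{k2} * C_n^{n1}."""
    n, k = n1 + n2, k1 + k2
    left = comb(n, k) * comb(k, k1) * comb(n - k, n1 - k1)
    right = comb(n1, k1) * comb(n2, k2) * comb(n, n1)
    return left == right

# All pairs of distances up to 6, and all admissible intermediate distances.
all_ok = all(identity_holds(n1, n2, k1, k2)
             for n1, n2 in product(range(7), repeat=2)
             for k1 in range(n1 + 1)
             for k2 in range(n2 + 1))
print(all_ok)
```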

Proposition 2.12. In the construction of $\nu_t^{x,y}$, $t \in [0,1]$, use a general random variable $N_t^{(x,y)} \in \{0,1,\dots,d(x,y)\}$, of parameter $d(x,y)$ and $t$, that satisfies a.s. $N_0^{(x,y)} = 0$ and $N_1^{(x,y)} = d(x,y)$ (instead of the Binomial; observe that this condition is here to ensure that $\nu_0^{x,y} = \delta_x$ and $\nu_1^{x,y} = \delta_y$, namely that $\nu_t^{x,y}$ is still an interpolation between the two Dirac measures), so that

$$\nu_t^{x,y}(z) = \mathbb{P}\big(N_t^{(x,y)} = d(x,z)\big)\,\frac{|\Gamma(x,z,y)|}{|\Gamma(x,y)|}.$$

Let $G_1 = (V_1,E_1)$, $G_2 = (V_2,E_2)$ be two graphs and let $G = G_1 \square G_2$ be their Cartesian product. Assume that for any $x = (x_1,x_2)$, $y = (y_1,y_2)$ and $z = (z_1,z_2)$ in $V_1 \times V_2$,

$$\nu_t^{x,y}(z) = \nu_t^{x_1,y_1}(z_1)\,\nu_t^{x_2,y_2}(z_2), \qquad \forall t \in [0,1].$$

Then, there exists a function $a\colon [0,1] \to [0,1]$ with $a(0) = 0$, $a(1) = 1$, such that $N_t^{(x,y)} \sim \mathcal{B}(a(t), d(x,y))$.

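Proof. Following the proof of Lemma 2.10 we have

$$\nu_t^{x,y}(z) = \mathbb{P}\big(N_t^{(x,y)} = d(x,z)\big)\,\frac{|\Gamma(x,z,y)|}{|\Gamma(x,y)|} = \mathbb{P}\big(N_t^{(x,y)} = d(x,z)\big)\,\frac{C_{d(x,z)}^{d(x_1,z_1)}\,C_{d(y,z)}^{d(y_1,z_1)}}{C_{d(x,y)}^{d(x_1,y_1)}}\,\frac{|\Gamma(x_1,z_1,y_1)|}{|\Gamma(x_1,y_1)|}\,\frac{|\Gamma(x_2,z_2,y_2)|}{|\Gamma(x_2,y_2)|}.$$

On the other hand,

$$\nu_t^{x_1,y_1}(z_1) = \mathbb{P}\big(N_t^{(x_1,y_1)} = d(x_1,z_1)\big)\,\frac{|\Gamma(x_1,z_1,y_1)|}{|\Gamma(x_1,y_1)|}$$

and

$$\nu_t^{x_2,y_2}(z_2) = \mathbb{P}\big(N_t^{(x_2,y_2)} = d(x_2,z_2)\big)\,\frac{|\Gamma(x_2,z_2,y_2)|}{|\Gamma(x_2,y_2)|}.$$

Hence, the identity $\nu_t^{x,y}(z) = \nu_t^{x_1,y_1}(z_1)\,\nu_t^{x_2,y_2}(z_2)$ ensures that

$$\frac{C_{d(x,z)}^{d(x_1,z_1)}\,C_{d(y,z)}^{d(y_1,z_1)}}{C_{d(x,y)}^{d(x_1,y_1)}}\;\mathbb{P}\big(N_t^{(x,y)} = d(x,z)\big) = \mathbb{P}\big(N_t^{(x_1,y_1)} = d(x_1,z_1)\big)\,\mathbb{P}\big(N_t^{(x_2,y_2)} = d(x_2,z_2)\big)$$

for any $z_1 \in [\![x_1,y_1]\!]$, $z_2 \in [\![x_2,y_2]\!]$.

Now, observe that

$$\frac{C_{d(x,z)}^{d(x_1,z_1)}\,C_{d(y,z)}^{d(y_1,z_1)}}{C_{d(x,y)}^{d(x_1,y_1)}} = \frac{C_{d(x_1,y_1)}^{d(x_1,z_1)}\,C_{d(x_2,y_2)}^{d(x_2,z_2)}}{C_{d(x,y)}^{d(x,z)}}.$$

Hence, the latter can be rewritten as

$$\frac{\mathbb{P}\big(N_t^{(x,y)} = d(x,z)\big)}{C_{d(x,y)}^{d(x,z)}} = \frac{\mathbb{P}\big(N_t^{(x_1,y_1)} = d(x_1,z_1)\big)}{C_{d(x_1,y_1)}^{d(x_1,z_1)}} \times \frac{\mathbb{P}\big(N_t^{(x_2,y_2)} = d(x_2,z_2)\big)}{C_{d(x_2,y_2)}^{d(x_2,z_2)}}.$$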

Set, for simplicity, for any $n, k$ with $0 \le k \le n$,

$$p_{n,k} = \frac{\mathbb{P}(N_t^{n} = k)}{C_n^k},$$

where $N_t^{n}$ denotes the variable $N_t^{(x,y)}$ with $d(x,y) = n$. Notice that $p_{n,k}$ depends also on $t$, while not explicitly stated. We end up with the following induction formula

(2.13) $$p_{n,k} = p_{n_1,k_1} \cdot p_{n-n_1,k-k_1}$$

for any integers $k_1, n_1, k, n$ satisfying the following conditions

$$k, n_1 \le n, \qquad k_1 \le \min(k,n_1), \qquad \text{and} \qquad n_1 - k_1 \le n - k.$$

(We set $n = d(x,y)$, $n_1 = d(x_1,y_1)$, $k = d(x,z)$ and $k_1 = d(x_1,z_1)$.)
The special choice $n_1 = 1$, $k_1 = 0$ leads to

(2.14) $$p_{n,k} = p_{1,0} \cdot p_{n-1,k}.$$

Hence, it cannot be that $p_{1,0} = 0$ (otherwise we would have $p_{n,k} = 0$ for any $k \ge 0$, any $n > 1$, which clearly is impossible since $\sum_{k=0}^{n} C_n^k\,p_{n,k} = 1$).
Set $b = b(t) = p_{1,0}$. From (2.14) we deduce that

$$p_{n,k} = b^{n-k}\,p_{k,k}.$$

Finally, the special choice $n = k$, $n_1 = k_1 = k-1$, in (2.13), ensures that

$$p_{k,k} = p_{k-1,k-1} \cdot p_{1,1}.$$

Since $p_{1,0} + p_{1,1} = 1$, the latter reads as

$$p_{k,k} = p_{1,1}^k = (1-b)^k.$$

It follows that

$$p_{n,k} = b^{n-k}(1-b)^k, \qquad \forall n,\ \forall k \le n.$$

Now set $a(t) = 1 - b(t)$ to end up with

$$\mathbb{P}(N_t^{n} = k) = C_n^k\,a^k(1-a)^{n-k},$$

which guarantees that $N_t^{(x,y)}$ is indeed a binomial variable of parameter $a(t)$ and $d(x,y)$.

To end the proof, it suffices to observe that $N_0^{(x,y)} = 0$ implies $a(0) = 0$, and that $N_1^{(x,y)} = d(x,y)$ implies $a(1) = 1$. □

2.5. Examples. In this section we collect some elementary facts on specific examples. Namely we give explicit expressions of $\nu_t^{x,y}$, and derive some properties, when available, on the complete graph, the two-point space, and the hypercube.

2.5.1. Complete graph $K_n$. Let $K_n$ be the complete graph with $n$ vertices. Then, given any two points $x, y \in K_n$, there exists only one geodesic from $x$ to $y$, namely $\Gamma(x,y) = \{(x,y)\}$. Hence, by construction of $\nu_t^{x,y}$, we have

(2.15) $$\nu_t^{x,y}(z) = 0 \quad \forall z \ne x,y; \qquad \nu_t^{x,y}(x) = 1-t, \quad \text{and} \quad \nu_t^{x,y}(y) = t.$$

Therefore, for any coupling $\pi$ with marginals $\nu_0$ and $\nu_1$ (two given probability measures on $K_n$), we have for any $z \in K_n$,

$$\nu_t^\pi(z) = \sum_{(x,y)\in K_n^2}\pi(x,y)\,\nu_t^{x,y}(z) = (1-t)\sum_{y\in K_n}\pi(z,y) + t\sum_{x\in K_n}\pi(x,z) = (1-t)\nu_0(z) + t\nu_1(z).$$

As a conclusion, on the complete graph, $\nu_t^\pi$ is a simple linear combination of $\nu_0$ and $\nu_1$ that does not depend on $\pi$.

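Moreover, under the assumption of Corollary 2.8, since $d(x,y) = |\Gamma(x,y)| = |\Gamma(z,y)| = 1$, we have

$$\frac{\partial}{\partial t}H(\nu_t^\pi|\mu)\Big|_{t=0} = \sum_{x\in K_n}\sum_{z\in K_n}\big(\log f(z) - \log f(x)\big)\pi(x,z) = \sum_{z\in K_n}\log f(z)\,\nu_1(z) - \sum_{x\in K_n} f(x)\log f(x)\,\mu(x),$$

where we set for simplicity $f = \nu_0/\mu$. On the other hand, since $f$ is a density with respect to $\mu$,

$$-\mathcal{E}_\mu(f,\log f) := -\frac{1}{2}\sum_{x,z\in K_n}\big(\log f(z) - \log f(x)\big)\big(f(z) - f(x)\big)\mu(x)\mu(z) = \sum_{z\in K_n}\log f(z)\,\mu(z) - \sum_{x\in K_n} f(x)\log f(x)\,\mu(x).$$

Hence, if $\nu_1 = \mu \equiv 1/n$ is the uniform measure on $K_n$ (notice all the measures on $K_n$ are then absolutely continuous with respect to $\mu$), we can conclude that

(2.16) $$\frac{\partial}{\partial t}H(\nu_t^\pi|\mu)\Big|_{t=0} = -\mathcal{E}_\mu(f,\log f).$$

Note that, when $\mu \equiv 1/n$, $\mathcal{E}_\mu$ corresponds to the Dirichlet form associated to the uniform chain on the complete graph (each point can jump to each point with probability $1/n$).

As a summary, on the complete graph we have: for any coupling $\pi$, for any $t \in [0,1]$,

$$\nu_t^\pi = (1-t)\nu_0 + t\nu_1.$$

For $\nu_1 = \mu \equiv 1/n$ and $f = \nu_0/\mu$, it holds

$$\frac{\partial}{\partial t}H(\nu_t^\pi|\mu)\Big|_{t=0} = -\mathcal{E}_\mu(f,\log f).$$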
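Both facts in this summary can be checked numerically on a small complete graph. The sketch below is illustrative only: the independent coupling, the random test measures and the helper names (`rand_prob`, `nu_xy`) are assumptions made for the example, and the derivative at $t=0$ is approximated by a forward finite difference.

```python
import math
import random

random.seed(0)
n = 4                                    # number of vertices of K_n

def rand_prob(m):
    w = [random.random() + 0.1 for _ in range(m)]
    s = sum(w)
    return [x / s for x in w]

nu0, nu1 = rand_prob(n), rand_prob(n)
# Any coupling would do; the independent one is used purely for illustration.
pi = [[nu0[x] * nu1[y] for y in range(n)] for x in range(n)]

def nu_xy(t, x, y, z):
    """nu_t^{x,y}(z) on the complete graph, as in (2.15); a Dirac mass when x == y."""
    if x == y:
        return 1.0 if z == x else 0.0
    if z == x:
        return 1.0 - t
    return t if z == y else 0.0

t = 0.37
path = [sum(pi[x][y] * nu_xy(t, x, y, z) for x in range(n) for y in range(n))
        for z in range(n)]
mix = [(1 - t) * nu0[z] + t * nu1[z] for z in range(n)]
gap_path = max(abs(a - b) for a, b in zip(path, mix))

# Derivative of the entropy at t = 0 when nu_1 = mu is the uniform measure.
mu = [1.0 / n] * n
f = [nu0[x] / mu[x] for x in range(n)]

def H(nu):
    return sum(p * math.log(p / m) for p, m in zip(nu, mu))

eps = 1e-6
nu_eps = [(1 - eps) * nu0[z] + eps * mu[z] for z in range(n)]
dH = (H(nu_eps) - H(nu0)) / eps
dirichlet = 0.5 * sum((math.log(f[z]) - math.log(f[x])) * (f[z] - f[x]) * mu[x] * mu[z]
                      for x in range(n) for z in range(n))
print(gap_path, dH, -dirichlet)
```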

2.5.2. The two-point space. The previous computations apply in particular to the two-point space $\{0,1\}$. In this specific case, let us consider $\mu$ to be a Bernoulli($p$) measure (i.e. $\mu(1) = p = 1 - q = 1 - \mu(0)$). As above, $\nu_t^\pi = (1-t)\nu_0 + t\nu_1$, for any coupling $\pi$ of $\nu_0$ and $\nu_1$. Moreover, it can also be checked by an easy computation that, for any $t \in [0,1]$,

$$\frac{\partial^2}{\partial t^2}H(\nu_t^\pi|\mu) = \frac{c^2}{\nu_t^\pi(0)} + \frac{c^2}{\nu_t^\pi(1)} \ge 4c^2,$$

where $c = \nu_1(0) - \nu_0(0)$, and $\|\nu_0 - \nu_1\|_{TV} = |\nu_1(0) - \nu_0(0)|$. As a result, one arrives at the following displacement convexity of the entropy on the two-point space:

(2.17) $$H(\nu_t^\pi|\mu) \le (1-t)H(\nu_0|\mu) + tH(\nu_1|\mu) - 2t(1-t)\|\nu_0 - \nu_1\|_{TV}^2, \qquad t \in [0,1].$$

In Section 4 below, we refine the above inequality further, and generalize in two ways - by deriving displacement convexity of entropy on the complete graph and the $n$-dimensional hypercube.

As an application, let us set $\nu_1 = \mu$, and use $f = \nu_0/\mu$ for the density; taking the limit $t \to 0$, and using

$$\frac{\partial}{\partial t}H(\nu_t^\pi|\mu)\Big|_{t=0} = -pq\,\big(f(1)-f(0)\big)\big(\log f(1) - \log f(0)\big) =: -\mathcal{E}_\mu(f,\log f),$$

we get a reinforced modified logarithmic Sobolev inequality on the two-point space of the following type:

(2.18) $$\mathrm{Ent}_\mu(f) \le \mathcal{E}_\mu(f,\log f) - 2\|f\mu - \mu\|_{TV}^2.$$

In the above, $\mathcal{E}_\mu(f,\log f)$ corresponds to the Dirichlet form associated with the Markov chain jumping from 0 to 1 with probability $p$ and from 1 to 0 with probability $q$. The inequality is a reinforcement of a modified log-Sobolev inequality, considered by previous researchers (as mentioned in the introduction), which lacks the negative term. Similarly to (2.17), we also refine (2.18) further in Proposition 5.12.
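Inequalities (2.17) and (2.18) are easy to test on a grid of Bernoulli measures. The following sketch is an illustration, not part of the original text: measures on $\{0,1\}$ are encoded by their mass at the point 1, the function names are hypothetical, and the total variation norm follows the normalisation of this subsection, $\|\nu_0-\nu_1\|_{TV} = |\nu_1(0)-\nu_0(0)|$.

```python
import math

def H(nu, mu):
    """Relative entropy on {0,1}; nu and mu are the masses at the point 1."""
    total = 0.0
    for a, b in ((1 - nu, 1 - mu), (nu, mu)):
        if a > 0:
            total += a * math.log(a / b)
    return total

def check_217(nu0, nu1, p, t):
    nut = (1 - t) * nu0 + t * nu1
    c = nu1 - nu0                        # |c| is the total variation norm used in (2.17)
    rhs = (1 - t) * H(nu0, p) + t * H(nu1, p) - 2 * t * (1 - t) * c * c
    return H(nut, p) <= rhs + 1e-12

def check_218(nu0, p):
    q = 1 - p
    f0, f1 = (1 - nu0) / q, nu0 / p      # density f = nu_0 / mu at the points 0 and 1
    dirichlet = p * q * (f1 - f0) * (math.log(f1) - math.log(f0))
    return H(nu0, p) <= dirichlet - 2 * (nu0 - p) ** 2 + 1e-12

grid = [i / 10 for i in range(1, 10)]
ok_217 = all(check_217(a, b, p, t) for a in grid for b in grid for p in grid for t in grid)
ok_218 = all(check_218(a, p) for a in grid for p in grid)
print(ok_217, ok_218)
```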

2.5.3. The n-dimensional hypercube $\Omega_n$. Consider the $n$-dimensional hypercube $\Omega_n = \{0,1\}^n$ whose edges consist of pairs of vertices that differ in precisely one coordinate. The graph distance here coincides with the Hamming distance:

$$d(x,y) = \sum_{i=1}^{n}\mathbf{1}_{x_i \ne y_i}, \qquad x,y \in \Omega_n.$$

Then, one observes that $|\Gamma(x,y)| = d(x,y)!$ (since, in order to move from $x$ to $y$ in the shortest way, one just needs to choose, among the $d(x,y)$ coordinates where $x$ and $y$ differ, the order of the flips (i.e. moves from $x_i$ to $1 - x_i$)). It follows from (2.2) that, as soon as $z$ belongs to a geodesic from $x$ to $y$,

$$\nu_t^{x,y}(z) = C_{d(x,y)}^{d(x,z)}\,t^{d(x,z)}(1-t)^{d(y,z)}\,\frac{d(x,z)!\,d(z,y)!}{d(x,y)!} = t^{d(x,z)}(1-t)^{d(y,z)},$$

and $\nu_t^{x,y}(z) = 0$ if $z$ does not belong to a geodesic from $x$ to $y$.

This expression can be recovered using the tensorisation property above. Namely, observe that Equation (2.15) can be rewritten for the two-point space as follows, for all coordinates $i$ and all $z_i$ on a geodesic from $x_i$ to $y_i$:

$$\nu_t^{x_i,y_i}(z_i) = t^{d(x_i,z_i)}(1-t)^{d(y_i,z_i)}.$$

Hence, by Lemma 2.10,

$$\nu_t^{x,y}(z) = \prod_{i=1}^{n}\nu_t^{x_i,y_i}(z_i) = t^{d(x,z)}(1-t)^{d(y,z)},$$

as soon as $z$ belongs to a geodesic from $x$ to $y$, and 0 otherwise. Observe that the latter can also be rewritten in terms of a product of probability measures on the fibers as

(2.19) $$\nu_t^{x,y} = \bigotimes_{i=1}^{n}\big((1-t)\delta_{x_i} + t\delta_{y_i}\big).$$

Given two probability measures $\nu_0, \nu_1$ on $\Omega_n$, and a coupling $\pi$ on $\Omega_n^2$, we can finally define

$$\nu_t^\pi(z) = \sum_{(x,y):\ z\in[\![x,y]\!]} t^{d(x,z)}(1-t)^{d(y,z)}\,\pi(x,y).$$

On the $n$-dimensional hypercube we have: for any couple $(x,y) \in \Omega_n^2$, and for any $t \in [0,1]$,

$$\nu_t^{x,y} = \bigotimes_{i=1}^{n}\big((1-t)\delta_{x_i} + t\delta_{y_i}\big).$$
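The agreement between the geodesic-counting definition of $\nu_t^{x,y}$ and the product form (2.19) can be verified by brute force on a small cube. This is a minimal sketch under stated assumptions: geodesics are counted by a simple recursion over neighbours that decrease the Hamming distance to the endpoint, and the names `n_geodesics`, `nu_def` and `nu_prod` are hypothetical helpers introduced for the example.

```python
from functools import lru_cache
from itertools import product
from math import comb

n = 3
V = list(product((0, 1), repeat=n))

def d(a, b):
    return sum(ai != bi for ai, bi in zip(a, b))

@lru_cache(maxsize=None)
def n_geodesics(a, b):
    """Number of shortest paths from a to b in the hypercube graph."""
    if a == b:
        return 1
    total = 0
    for i in range(n):
        c = a[:i] + (1 - a[i],) + a[i + 1:]      # flip coordinate i
        if d(c, b) == d(a, b) - 1:
            total += n_geodesics(c, b)
    return total

def nu_def(t, x, y, z):
    """nu_t^{x,y}(z) from the binomial weight and the geodesic counts."""
    if d(x, z) + d(z, y) != d(x, y):
        return 0.0
    k, m = d(x, z), d(x, y)
    through_z = n_geodesics(x, z) * n_geodesics(z, y)
    return comb(m, k) * t**k * (1 - t)**(m - k) * through_z / n_geodesics(x, y)

def nu_prod(t, x, y, z):
    """Product form: coordinatewise (1-t) delta_{x_i} + t delta_{y_i}."""
    p = 1.0
    for xi, yi, zi in zip(x, y, z):
        p *= (1 - t) * (zi == xi) + t * (zi == yi)
    return p

t = 0.3
max_gap = max(abs(nu_def(t, x, y, z) - nu_prod(t, x, y, z))
              for x in V for y in V for z in V)
print(max_gap)
```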



3. Weak transport cost 



In this section we recall a notion of a discrete Wasserstein-type distance, called weak transport cost 
- introduced and studied in |[3Tll50ll , developed further in [15] - and collect some useful facts from 
iPTSIl . Also, we introduce the notion of a Knothe-Rosenblatt coupling which will play a crucial role in 
the displacement convexity of the entropy property on product spaces. 

3.1. Definition and first properties. For the notion of a weak transport cost, first recall the definition 
of P(vq, vi) introduced in Section fTTTl 



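Definition 3.1. Let $\nu_0, \nu_1 \in \mathcal{P}(V)$. Then, the weak transport cost $\widetilde{\mathcal{T}}_2(\nu_1|\nu_0)$ between $\nu_0$ and $\nu_1$ is defined as

$$\widetilde{\mathcal{T}}_2(\nu_1|\nu_0) := \inf_{p\in P(\nu_0,\nu_1)}\sum_{x\in V}\Big(\sum_{y\in V} d(x,y)\,p(x,y)\Big)^2\nu_0(x).$$

It can be shown that

$$(\nu_0,\nu_1) \mapsto \sqrt{\widetilde{\mathcal{T}}_2(\nu_1|\nu_0)} + \sqrt{\widetilde{\mathcal{T}}_2(\nu_0|\nu_1)}$$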

is a distance on P(V), see 031 . 

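Also recall from the introduction the following notation: given $\pi \in \Pi(\nu_0,\nu_1)$, consider the kernels $p \in P(\nu_0,\nu_1)$ and $\bar p \in P(\nu_1,\nu_0)$ defined by $\pi(x,y) = \nu_0(x)\,p(x,y) = \nu_1(y)\,\bar p(y,x)$ and set

(3.2) $$I_2(\pi) := \sum_{x\in V}\Big(\sum_{y\in V} d(x,y)\,p(x,y)\Big)^2\nu_0(x),$$

and

$$\bar I_2(\pi) := \sum_{y\in V}\Big(\sum_{x\in V} d(x,y)\,\bar p(y,x)\Big)^2\nu_1(y).$$

With this notation,

$$\widetilde{\mathcal{T}}_2(\nu_1|\nu_0) = \inf_{\pi\in\Pi(\nu_0,\nu_1)} I_2(\pi) \qquad \text{and} \qquad \widetilde{\mathcal{T}}_2(\nu_0|\nu_1) = \inf_{\pi\in\Pi(\nu_0,\nu_1)} \bar I_2(\pi).$$

Also, define

$$J_2(\pi) := \Big(\sum_{x\in V}\sum_{y\in V} d(x,y)\,\pi(x,y)\Big)^2, \qquad \mathcal{T}_2(\nu_0,\nu_1) := \inf_{\pi\in\Pi(\nu_0,\nu_1)} J_2(\pi),$$

and observe that $\mathcal{T}_2(\nu_0,\nu_1) = W_1^2(\nu_0,\nu_1)$ where $W_1$ is the usual $L_1$-Wasserstein distance associated to the distance $d$.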

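When $\nu_0$ and $\nu_1$ are absolutely continuous with respect to some probability measure $\mu$, and $d$ is the Hamming distance $d(x,y) = \mathbf{1}_{x\ne y}$, $x,y \in V$, the weak transport cost and the $L_1$-Wasserstein distance take an explicit form. This is stated in the next lemma. We give the proof for completeness.

Lemma 3.3 ([15]). Assume that $\nu_0, \nu_1 \in \mathcal{P}(V)$ are absolutely continuous with respect to a third probability measure $\mu \in \mathcal{P}(V)$, with respective densities $f_0$ and $f_1$. Assume that $d(x,y) = \mathbf{1}_{x\ne y}$, $x,y \in V$. Then it holds

$$\widetilde{\mathcal{T}}_2(\nu_1|\nu_0) = \int\Big[1 - \frac{f_1}{f_0}\Big]_+^2 f_0\,d\mu,$$

where $[X]_+ = \max(X,0)$, and

$$\sqrt{\mathcal{T}_2(\nu_0,\nu_1)} = \int [f_0 - f_1]_+\,d\mu = \frac{1}{2}\int |f_0 - f_1|\,d\mu = \frac{1}{2}\|\nu_0 - \nu_1\|_{TV},$$

with $\|\cdot\|_{TV}$ the total variation norm.

Remark 3.4. Observe that $\widetilde{\mathcal{T}}_2(\nu_1|\nu_0)$ does not depend on $\mu$.

Proof. For any $\pi \in \Pi(\nu_0,\nu_1)$ and any $x \in V$, one has

$$1 - \sum_{y\in V} d(x,y)\,p(x,y) = \frac{\pi(x,x)}{\nu_0(x)} \le \frac{\min(\nu_0(x),\nu_1(x))}{\nu_0(x)} = \min\Big(1, \frac{\nu_1(x)}{\nu_0(x)}\Big) = \min\Big(1, \frac{f_1(x)}{f_0(x)}\Big),$$

and therefore

$$\sum_{y\in V} d(x,y)\,p(x,y) \ge \Big[1 - \frac{f_1(x)}{f_0(x)}\Big]_+.$$

By integrating with respect to the measure $\nu_0$ and then optimizing over all $\pi \in \Pi(\nu_0,\nu_1)$, it follows that

$$\int [f_0 - f_1]_+\,d\mu \le \sqrt{\mathcal{T}_2(\nu_0,\nu_1)},$$

and

$$\int\Big[1 - \frac{f_1}{f_0}\Big]_+^2 f_0\,d\mu \le \widetilde{\mathcal{T}}_2(\nu_1|\nu_0).$$

The equality is reached choosing $\pi^* \in \Pi(\nu_0,\nu_1)$ defined by

(3.5) $$\pi^*(x,y) = \nu_0(x)\,p^*(x,y) = \mathbf{1}_{x=y}\min(\nu_0(x),\nu_1(x)) + \mathbf{1}_{x\ne y}\,\frac{[\nu_0(x)-\nu_1(x)]_+\,[\nu_1(y)-\nu_0(y)]_+}{\sum_{z\in V}[\nu_1(z)-\nu_0(z)]_+},$$

since $\sum_{y\in V} d(x,y)\,p^*(x,y) = \big[1 - \frac{f_1(x)}{f_0(x)}\big]_+$. □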
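The coupling (3.5) and the closed forms of Lemma 3.3 can be checked on a three-point example. The sketch below is illustrative: the test measures and the helper names (`pi_star`, `I2`) are assumptions, and the formulas are written directly in terms of $\nu_0,\nu_1$, which is legitimate since the costs do not depend on $\mu$ (Remark 3.4).

```python
def pos(a):
    return max(a, 0.0)

def pi_star(nu0, nu1):
    """Coupling (3.5): keep the common mass in place, spread the excess proportionally."""
    m = len(nu0)
    S = sum(pos(nu1[z] - nu0[z]) for z in range(m))
    pi = [[0.0] * m for _ in range(m)]
    for x in range(m):
        for y in range(m):
            if x == y:
                pi[x][y] = min(nu0[x], nu1[x])
            elif S > 0:
                pi[x][y] = pos(nu0[x] - nu1[x]) * pos(nu1[y] - nu0[y]) / S
    return pi

def I2(pi, nu0):
    """I_2(pi) for the Hamming distance: sum_y d(x,y) p(x,y) = 1 - p(x,x)."""
    return sum((1 - pi[x][x] / nu0[x]) ** 2 * nu0[x]
               for x in range(len(nu0)) if nu0[x] > 0)

nu0 = [0.5, 0.2, 0.3]
nu1 = [0.1, 0.6, 0.3]
pi = pi_star(nu0, nu1)

rows = [sum(pi[x]) for x in range(3)]                          # should equal nu0
cols = [sum(pi[x][y] for x in range(3)) for y in range(3)]     # should equal nu1
closed_form = sum(pos(1 - nu1[x] / nu0[x]) ** 2 * nu0[x] for x in range(3))
moved_mass = sum(pi[x][y] for x in range(3) for y in range(3) if x != y)
half_tv = 0.5 * sum(abs(a - b) for a, b in zip(nu0, nu1))
print(I2(pi, nu0), closed_form, moved_mass, half_tv)
```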



3.2. The Knothe-Rosenblatt coupling. In this subsection, we recall a general method, due to Kno- 
the-Rosenblatt E2l l45l . enabling to construct couplings between probability measures on product 
spaces. 

Consider two graphs $G_1 = (V_1,E_1)$ and $G_2 = (V_2,E_2)$ and two probability measures $\nu_0, \nu_1 \in \mathcal{P}(V_1 \times V_2)$. The disintegration formulas of $\nu_0, \nu_1$ (recall (1.8)) read

(3.6) $$\nu_0(x_1,x_2) = \nu_0^2(x_2)\,\nu_0^1(x_1|x_2) \qquad \text{and} \qquad \nu_1(y_1,y_2) = \nu_1^2(y_2)\,\nu_1^1(y_1|y_2).$$

Let $\pi^2 \in \mathcal{P}(V_2^2)$ be a coupling of $\nu_0^2, \nu_1^2$, and for all $(x_2,y_2) \in V_2^2$ let $\pi^1(\,\cdot\,|x_2,y_2) \in \mathcal{P}(V_1^2)$ be a coupling of $\nu_0^1(\,\cdot\,|x_2)$ and $\nu_1^1(\,\cdot\,|y_2)$, $x_2, y_2 \in V_2$. We are now in a position to define the Knothe-Rosenblatt coupling.

Definition 3.7 (Knothe-Rosenblatt coupling). Let $\nu_0, \nu_1 \in \mathcal{P}(V_1 \times V_2)$, and consider a family of couplings $\pi^2$, $\{\pi^1(\,\cdot\,|x_2,y_2)\}_{x_2,y_2}$ as above; the coupling $\hat\pi \in \mathcal{P}([V_1 \times V_2]^2)$, defined by

$$\hat\pi((x_1,x_2),(y_1,y_2)) := \pi^2(x_2,y_2)\,\pi^1(x_1,y_1|x_2,y_2), \qquad (x_1,x_2),(y_1,y_2) \in V_1 \times V_2,$$

is called the Knothe-Rosenblatt coupling of $\nu_0, \nu_1$ associated with the family of couplings $\{\pi^2, \{\pi^1(\,\cdot\,|x_2,y_2)\}_{x_2,y_2}\}$.

It is easy to check that the Knothe-Rosenblatt coupling is indeed a coupling of $\nu_0, \nu_1$. Note that it is usually required that the couplings $\pi^2$, $\{\pi^1(\,\cdot\,|x_2,y_2)\}_{x_2,y_2}$ are optimal for some weak transport cost, but we will not make this assumption in what follows.
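The two-factor construction of Definition 3.7 can be sketched in a few lines. The example below is an illustration under assumptions: the coupling (3.5) is used at every stage (the definition allows any choice), the measures are hypothetical strictly positive tables indexed as `nu[x1][x2]`, and the function names are not from the original. It then checks that the resulting $\hat\pi$ has marginals $\nu_0$ and $\nu_1$.

```python
def pos(a):
    return max(a, 0.0)

def pi_star(a, b):
    """Coupling (3.5) of two probability vectors a and b of equal length."""
    m = len(a)
    S = sum(pos(b[z] - a[z]) for z in range(m))
    return [[min(a[x], b[x]) if x == y
             else (pos(a[x] - b[x]) * pos(b[y] - a[y]) / S if S > 0 else 0.0)
             for y in range(m)] for x in range(m)]

def knothe_rosenblatt(nu0, nu1):
    """Knothe-Rosenblatt coupling on V1 x V2, returned as {(x1, x2, y1, y2): mass}."""
    n1, n2 = len(nu0), len(nu0[0])
    m0 = [sum(nu0[x1][x2] for x1 in range(n1)) for x2 in range(n2)]   # nu_0^2
    m1 = [sum(nu1[y1][y2] for y1 in range(n1)) for y2 in range(n2)]   # nu_1^2
    pi2 = pi_star(m0, m1)
    pi_hat = {}
    for x2 in range(n2):
        for y2 in range(n2):
            c0 = [nu0[x1][x2] / m0[x2] for x1 in range(n1)]           # nu_0^1(. | x2)
            c1 = [nu1[y1][y2] / m1[y2] for y1 in range(n1)]           # nu_1^1(. | y2)
            pi1 = pi_star(c0, c1)
            for x1 in range(n1):
                for y1 in range(n1):
                    pi_hat[(x1, x2, y1, y2)] = pi2[x2][y2] * pi1[x1][y1]
    return pi_hat

nu0 = [[0.10, 0.20, 0.05], [0.25, 0.10, 0.30]]
nu1 = [[0.30, 0.05, 0.15], [0.10, 0.25, 0.15]]
pi_hat = knothe_rosenblatt(nu0, nu1)

err0 = max(abs(sum(pi_hat[(x1, x2, y1, y2)] for y1 in range(2) for y2 in range(3))
               - nu0[x1][x2]) for x1 in range(2) for x2 in range(3))
err1 = max(abs(sum(pi_hat[(x1, x2, y1, y2)] for x1 in range(2) for x2 in range(3))
               - nu1[y1][y2]) for y1 in range(2) for y2 in range(3))
print(err0, err1)
```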

The preceding construction can easily be generalized to products of $n$ graphs. Consider $n$ graphs $G_1 = (V_1,E_1), \dots, G_n = (V_n,E_n)$, and two probability measures $\nu_0, \nu_1 \in \mathcal{P}(V_1 \times \cdots \times V_n)$ admitting the following disintegration formulas: for all $x = (x_1,\dots,x_n)$, $y = (y_1,\dots,y_n) \in V_1 \times \cdots \times V_n$,

$$\nu_0(x) = \nu_0^n(x_n)\,\nu_0^{n-1}(x_{n-1}|x_n)\,\nu_0^{n-2}(x_{n-2}|x_{n-1},x_n)\cdots\nu_0^1(x_1|x_2,\dots,x_n),$$

$$\nu_1(y) = \nu_1^n(y_n)\,\nu_1^{n-1}(y_{n-1}|y_n)\,\nu_1^{n-2}(y_{n-2}|y_{n-1},y_n)\cdots\nu_1^1(y_1|y_2,\dots,y_n).$$

For all $j = 1,\dots,n$, let $\pi^j(\,\cdot\,|x_{j+1},\dots,x_n,y_{j+1},\dots,y_n) \in \mathcal{P}(V_j^2)$ be a coupling of $\nu_0^j(\,\cdot\,|x_{j+1},\dots,x_n)$ and $\nu_1^j(\,\cdot\,|y_{j+1},\dots,y_n)$. The Knothe-Rosenblatt coupling $\hat\pi \in \mathcal{P}([V_1 \times \cdots \times V_n]^2)$ between $\nu_0$ and $\nu_1$ is then defined by

$$\hat\pi(x,y) = \pi^n(x_n,y_n)\,\pi^{n-1}(x_{n-1},y_{n-1}|x_n,y_n)\cdots\pi^1(x_1,y_1|x_2,\dots,x_n,y_2,\dots,y_n),$$

for all $x = (x_1,x_2,\dots,x_n)$ and $y = (y_1,y_2,\dots,y_n)$.

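3.3. Tensorisation. Another useful property of the weak transport cost defined above is that it tensorises in the following sense. For $1 \le i \le n$, let $G_i = (V_i,E_i)$ be a graph with the associated distance $d_i$. Given two probability measures $\nu_0, \nu_1$ in $\mathcal{P}(V_1 \times \cdots \times V_n)$, define

$$\widetilde{\mathcal{T}}_2^{(n)}(\nu_1|\nu_0) := \inf_{p\in P(\nu_0,\nu_1)}\sum_{x\in V_1\times\cdots\times V_n}\sum_{i=1}^{n}\Big(\sum_{y\in V_1\times\cdots\times V_n} d_i(x_i,y_i)\,p(x,y)\Big)^2\nu_0(x),$$

where $x = (x_1,\dots,x_n)$, $y = (y_1,\dots,y_n) \in V_1 \times \cdots \times V_n$.

As above, for any coupling $\pi$ of $\nu_0, \nu_1 \in \mathcal{P}(V_1 \times \cdots \times V_n)$ we also define

$$I_2^{(n)}(\pi) := \sum_{x\in V_1\times\cdots\times V_n}\sum_{i=1}^{n}\Big(\sum_{y\in V_1\times\cdots\times V_n} d_i(x_i,y_i)\,p(x,y)\Big)^2\nu_0(x),$$

where $p$ is such that $\pi(x,y) = \nu_0(x)\,p(x,y)$, for all $x,y \in V_1 \times \cdots \times V_n$. Similarly, one defines $\bar I_2^{(n)}$. We also define

$$J_2^{(n)}(\pi) := \sum_{i=1}^{n}\Big(\sum_{x,y\in V_1\times\cdots\times V_n} d_i(x_i,y_i)\,\pi(x,y)\Big)^2$$

and

$$\mathcal{T}_2^{(n)}(\nu_0,\nu_1) := \inf_{\pi\in\Pi(\nu_0,\nu_1)} J_2^{(n)}(\pi).$$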

Using the notation of Section 3.2 above, we can state the result.

Proposition 3.8. Let $\nu_0, \nu_1$ in $\mathcal{P}(V_1 \times \cdots \times V_n)$; and consider a family of couplings $\pi^n \in \Pi(\nu_0^n,\nu_1^n)$ and $\pi^k(\,\cdot\,|x_{k+1},\dots,x_n,y_{k+1},\dots,y_n) \in \Pi\big(\nu_0^k(\,\cdot\,|x_{k+1},\dots,x_n),\,\nu_1^k(\,\cdot\,|y_{k+1},\dots,y_n)\big)$ with $(x_2,\dots,x_n), (y_2,\dots,y_n) \in V_2 \times \cdots \times V_n$, as above. Then,

$$I_2^{(n)}(\hat\pi) \le I_2(\pi^n) + \sum_{k=1}^{n-1}\ \sum_{x,y\in V_1\times\cdots\times V_n}\hat\pi(x,y)\,I_2\big(\pi^k(\,\cdot\,|x_{k+1},\dots,x_n,y_{k+1},\dots,y_n)\big),$$

where $\hat\pi$ is the Knothe-Rosenblatt coupling of $\nu_0$ and $\nu_1$ associated with the family of couplings above. The same holds for $\bar I_2^{(n)}(\hat\pi)$ and $J_2^{(n)}(\hat\pi)$.

In particular, if the couplings $\pi^n$ and $\pi^k(\,\cdot\,|x_{k+1},\dots,x_n,y_{k+1},\dots,y_n)$ are assumed to achieve the infimum in the definition of the weak transport costs between $\nu_0^n$ and $\nu_1^n$ and between $\nu_0^k(\,\cdot\,|x_{k+1},\dots,x_n)$ and $\nu_1^k(\,\cdot\,|y_{k+1},\dots,y_n)$ for all $k \in \{1,\dots,n-1\}$, we immediately get the following tensorisation inequality for $\widetilde{\mathcal{T}}_2$:

(3.9) $$\widetilde{\mathcal{T}}_2^{(n)}(\nu_1|\nu_0) \le \widetilde{\mathcal{T}}_2(\nu_1^n|\nu_0^n) + \sum_{k=1}^{n-1}\ \sum_{x,y\in V_1\times\cdots\times V_n}\hat\pi(x,y)\,\widetilde{\mathcal{T}}_2\big(\nu_1^k(\,\cdot\,|y_{k+1},\dots,y_n)\,\big|\,\nu_0^k(\,\cdot\,|x_{k+1},\dots,x_n)\big).$$

In an obvious way, the same kind of conclusion holds replacing $\widetilde{\mathcal{T}}_2$ by $\mathcal{T}_2$.

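Proof. In this proof, we will use the following shorthand notation: if $x \in V$ and if $1 \le i \le j \le n$, we will denote by $x_{i:j}$ the subvector $(x_i, x_{i+1}, \dots, x_j) \in V_i \times \cdots \times V_j$.

Define the kernels $p(\,\cdot\,,\,\cdot\,)$, $p^n(\,\cdot\,,\,\cdot\,)$ and $p^k(\,\cdot\,,\,\cdot\,|x_{k+1:n},y_{k+1:n})$ by the formulas

$$\hat\pi(x,y) = p(x,y)\,\nu_0(x),$$
$$\pi^k(x_k,y_k|x_{k+1:n},y_{k+1:n}) = p^k(x_k,y_k|x_{k+1:n},y_{k+1:n})\,\nu_0^k(x_k|x_{k+1:n}), \qquad k = 1,\dots,n-1,$$
$$\pi^n(x_n,y_n) = p^n(x_n,y_n)\,\nu_0^n(x_n).$$

By the definition of the Knothe-Rosenblatt coupling $\hat\pi$, it holds

$$p(x,y) = \prod_{k=1}^{n-1} p^k(x_k,y_k|x_{k+1:n},y_{k+1:n}) \times p^n(x_n,y_n).$$

As a result, for $i \in \{1,\dots,n-1\}$,

$$\begin{aligned}
\Big(\sum_{y} d_i(x_i,y_i)\,p(x,y)\Big)^2 &= \Big(\sum_{y_{i:n}} d_i(x_i,y_i)\prod_{k=i}^{n-1} p^k(x_k,y_k|x_{k+1:n},y_{k+1:n})\,p^n(x_n,y_n)\Big)^2\\
&\le \sum_{y_{i+1:n}}\ \prod_{k=i+1}^{n-1} p^k(x_k,y_k|x_{k+1:n},y_{k+1:n})\,p^n(x_n,y_n)\,\Big(\sum_{y_i} d_i(x_i,y_i)\,p^i(x_i,y_i|x_{i+1:n},y_{i+1:n})\Big)^2,
\end{aligned}$$

where the inequality comes from Jensen's inequality. Therefore,

$$\begin{aligned}
\sum_{x}\Big(\sum_{y} d_i(x_i,y_i)\,p(x,y)\Big)^2\nu_0(x) &\le \sum_{x_{i+1:n}}\sum_{y_{i+1:n}}\ \prod_{k=i+1}^{n-1}\pi^k(x_k,y_k|x_{k+1:n},y_{k+1:n})\,\pi^n(x_n,y_n)\ \sum_{x_i}\Big(\sum_{y_i} d_i(x_i,y_i)\,p^i(x_i,y_i|x_{i+1:n},y_{i+1:n})\Big)^2\nu_0^i(x_i|x_{i+1:n})\\
&= \sum_{x_{i+1:n}}\sum_{y_{i+1:n}}\ \prod_{k=i+1}^{n-1}\pi^k(x_k,y_k|x_{k+1:n},y_{k+1:n})\,\pi^n(x_n,y_n)\ I_2\big(\pi^i(\,\cdot\,|x_{i+1:n},y_{i+1:n})\big)\\
&= \sum_{x,y}\hat\pi(x,y)\,I_2\big(\pi^i(\,\cdot\,|x_{i+1:n},y_{i+1:n})\big).
\end{aligned}$$

Similarly,

$$\sum_{x}\Big(\sum_{y} d_n(x_n,y_n)\,p(x,y)\Big)^2\nu_0(x) = I_2(\pi^n).$$

Summing all these inequalities gives the announced tensorisation formula.
The proof for $\bar I_2^{(n)}$ and $J_2^{(n)}$ is identical and left to the reader. □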



4. Displacement convexity property of the entropy. 

Using the weak transport cost defined in the previous section, we can now derive a displacement convexity property of the entropy on graphs. More precisely, we will derive such a property for the complete graph. Then we will prove that our definition of $\nu_t^\pi$ allows the displacement convexity to tensorise. As a consequence, we will be able to derive such a property on the $n$-dimensional hypercube.

4.1. The complete graph. Consider the complete graph $K_n$, or equivalently any graph $G$ equipped with the Hamming distance $d(x,y) = \mathbf{1}_{x\ne y}$ (in the definition of the weak transport cost). Recall the definition of $\nu_t^\pi$ given in (2.4), and that we proved, in Section 2.5.1, that $\nu_t^\pi = (1-t)\nu_0 + t\nu_1$ for any choice of coupling $\pi$. Then, the following holds.

Proposition 4.1 (Displacement convexity on the complete graph). Let $\nu_0, \nu_1, \mu \in \mathcal{P}(K_n)$ be three probability measures. Assume that $\nu_0, \nu_1$ are absolutely continuous with respect to $\mu$. Then

$$H(\nu_t|\mu) \le (1-t)H(\nu_0|\mu) + tH(\nu_1|\mu) - \frac{t(1-t)}{2}\big(\widetilde{\mathcal{T}}_2(\nu_1|\nu_0) + \widetilde{\mathcal{T}}_2(\nu_0|\nu_1)\big), \qquad \forall t \in [0,1],$$

where $\nu_t = (1-t)\nu_0 + t\nu_1$.

Proof. Our aim is simply to bound from below the second order derivative of $t \mapsto F(t) := H(\nu_t|\mu)$. Denote by $f_0$ and $f_1$ the respective densities of $\nu_0$ and $\nu_1$ with respect to $\mu$. We have

$$F(t) = \int \log\big((1-t)f_0 + tf_1\big)\,\big((1-t)f_0 + tf_1\big)\,d\mu.$$

Thus $F'(t) = \int \log\big((1-t)f_0 + tf_1\big)\,d(\nu_1 - \nu_0)$. In turn,

$$\begin{aligned}
F''(t) &= \int\frac{(f_0-f_1)^2}{(1-t)f_0+tf_1}\,d\mu = \int\frac{[f_0-f_1]_+^2}{(1-t)f_0+tf_1}\,d\mu + \int\frac{[f_1-f_0]_+^2}{(1-t)f_0+tf_1}\,d\mu\\
&\ge \int\frac{[f_0-f_1]_+^2}{f_0}\,d\mu + \int\frac{[f_1-f_0]_+^2}{f_1}\,d\mu = \int\Big[1-\frac{f_1}{f_0}\Big]_+^2 f_0\,d\mu + \int\Big[1-\frac{f_0}{f_1}\Big]_+^2 f_1\,d\mu\\
&= \widetilde{\mathcal{T}}_2(\nu_1|\nu_0) + \widetilde{\mathcal{T}}_2(\nu_0|\nu_1),
\end{aligned}$$

where, in the last line, we used Lemma 3.3. As a consequence, the function $G\colon t \mapsto F(t) - \frac{t^2}{2}\big(\widetilde{\mathcal{T}}_2(\nu_1|\nu_0) + \widetilde{\mathcal{T}}_2(\nu_0|\nu_1)\big)$ is convex on $[0,1]$, so that $G(t) \le (1-t)G(0) + tG(1)$, which gives precisely, after some algebra, the desired inequality. □
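The inequality of Proposition 4.1 can be tested on random triples of measures. This is a minimal numerical sketch, with assumptions stated plainly: the measures are random with full support, the weak transport costs are computed from the closed form of Lemma 3.3 written in terms of $\nu_0,\nu_1$, and all function names are hypothetical.

```python
import math
import random

random.seed(1)

def rand_prob(m):
    w = [random.random() + 0.05 for _ in range(m)]
    s = sum(w)
    return [x / s for x in w]

def H(nu, mu):
    return sum(a * math.log(a / b) for a, b in zip(nu, mu) if a > 0)

def weak_cost(target, source):
    """Weak transport cost of target given source for the Hamming distance (Lemma 3.3)."""
    return sum(max(s - r, 0.0) ** 2 / s for r, s in zip(target, source) if s > 0)

def prop_41_holds(nu0, nu1, mu, t):
    nut = [(1 - t) * a + t * b for a, b in zip(nu0, nu1)]
    penalty = 0.5 * t * (1 - t) * (weak_cost(nu1, nu0) + weak_cost(nu0, nu1))
    return H(nut, mu) <= (1 - t) * H(nu0, mu) + t * H(nu1, mu) - penalty + 1e-12

trials = [(rand_prob(5), rand_prob(5), rand_prob(5)) for _ in range(200)]
all_ok = all(prop_41_holds(a, b, m, k / 20) for a, b, m in trials for k in range(21))
print(all_ok)
```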

Remark 4.2 (Pinsker inequality). As an immediate consequence of the previous proposition, we 
will derive Csiszar-Kullback-Pinsker inequality ( 11401 l23l 0). Recall the notation of the proof of 
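Proposition 4.1. Applying Cauchy-Schwarz yields

$$F''(t) = \int\Big(\frac{|f_0-f_1|}{\sqrt{(1-t)f_0+tf_1}}\Big)^2 d\mu \cdot \int\Big(\sqrt{(1-t)f_0+tf_1}\Big)^2 d\mu \ge \Big(\int |f_0-f_1|\,d\mu\Big)^2 = \|\nu_0-\nu_1\|_{TV}^2.$$

Hence the map $G\colon t \mapsto F(t) - \frac{t^2}{2}\|\nu_0-\nu_1\|_{TV}^2$ is convex on $[0,1]$ so that

(4.3) $$H(\nu_t|\mu) \le (1-t)H(\nu_0|\mu) + tH(\nu_1|\mu) - \frac{t(1-t)}{2}\|\nu_0-\nu_1\|_{TV}^2, \qquad \forall t \in [0,1].$$

Inequality (4.3) is a reinforcement of the well known Csiszar-Kullback-Pinsker's inequality (see e.g.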
IU Theorem 8.2.7]) which asserts that 

$$\|\nu_0-\nu_1\|_{TV}^2 \le 2H(\nu_1|\nu_0).$$

Indeed, take $\mu = \nu_0$ together with the fact that $H(\nu_t|\mu) \ge 0$, and then take the limit $t \to 0$ in (4.3) to obtain the above inequality.

Csiszar-Kullback-Pinsker's inequality, and its generalizations, are known to have many applications in Probability theory, Analysis and Information theory, see [55, Page 636] for a review.

Now we compare the displacement convexity property of Proposition 4.1 with (4.3). For the two-point space it is easy to check that the ratio

$$\frac{\widetilde{\mathcal{T}}_2(\nu_1|\nu_0) + \widetilde{\mathcal{T}}_2(\nu_0|\nu_1)}{\|\nu_0-\nu_1\|_{TV}^2}$$

is not uniformly bounded above over all probability measures $\nu_0$ and $\nu_1$. On the other hand, we claim that

(4.4) $$\frac{\widetilde{\mathcal{T}}_2(\nu_1|\nu_0) + \widetilde{\mathcal{T}}_2(\nu_0|\nu_1)}{\|\nu_0-\nu_1\|_{TV}^2} \ge \frac{1}{2}, \qquad \forall \nu_0, \nu_1,$$

which implies that the result in Proposition 4.1 is stronger than (4.3), up to a constant 2. We also provide an example below which shows that we cannot exactly recover (4.3) using Proposition 4.1.
Let us prove the claim, and more precisely that the following holds

(4.5) $$\widetilde{\mathcal{T}}_2(\nu_1|\nu_0) + \widetilde{\mathcal{T}}_2(\nu_0|\nu_1) \ge \frac{\|\nu_0-\nu_1\|_{TV}^2}{1+\frac{1}{2}\|\nu_0-\nu_1\|_{TV}} \ge \frac{1}{2}\|\nu_0-\nu_1\|_{TV}^2.$$

This is a consequence of Cauchy-Schwarz inequality, namely, we have

$$\widetilde{\mathcal{T}}_2(\nu_1|\nu_0) + \widetilde{\mathcal{T}}_2(\nu_0|\nu_1) \ge \frac{\big(\int [f_1-f_0]_+\,d\mu\big)^2}{\nu_1(f_1>f_0)} + \frac{\big(\int [f_0-f_1]_+\,d\mu\big)^2}{\nu_0(f_0>f_1)}.$$

Since $\|\nu_0-\nu_1\|_{TV} = 2\int [f_1-f_0]_+\,d\mu = 2\big(\nu_1(f_1>f_0) - \nu_0(f_1>f_0)\big)$, we get

$$\widetilde{\mathcal{T}}_2(\nu_1|\nu_0) + \widetilde{\mathcal{T}}_2(\nu_0|\nu_1) \ge \inf_{u}\ \frac{\big(1+\frac{1}{2}\|\nu_0-\nu_1\|_{TV}\big)\,\|\nu_0-\nu_1\|_{TV}^2}{4u\big(1+\frac{1}{2}\|\nu_0-\nu_1\|_{TV}-u\big)} = \frac{\|\nu_0-\nu_1\|_{TV}^2}{1+\frac{1}{2}\|\nu_0-\nu_1\|_{TV}}.$$

We now give an example that achieves equality in the first inequality of (4.5), thus confirming that Proposition 4.1 cannot exactly recover (4.3): Let $\nu_0$ and $\nu_1$ be two probability measures on the two-point space $\{0,1\}$ defined by $\nu_1(1) = \nu_0(0) = 3/4$ and $\nu_1(0) = \nu_0(1) = 1/4$. Then

$$\|\nu_0-\nu_1\|_{TV} = 2\big(\nu_1(1)-\nu_0(1)\big) = 1,$$

$$\widetilde{\mathcal{T}}_2(\nu_1|\nu_0) + \widetilde{\mathcal{T}}_2(\nu_0|\nu_1) = \frac{(\nu_1(1)-\nu_0(1))^2}{\nu_1(1)} + \frac{(\nu_0(0)-\nu_1(0))^2}{\nu_0(0)} = 2/3,$$

which gives the (claimed) equality in (4.5).
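The example and the two lower bounds in (4.5) can be reproduced numerically. The sketch below is illustrative only: it uses the closed form of Lemma 3.3 for the weak transport costs, the normalisation $\|\nu_0-\nu_1\|_{TV} = \sum_x|\nu_0(x)-\nu_1(x)|$ of this section, and hypothetical helper names; the random pairs on three points are an extra sanity check that is not in the original.

```python
import random

def weak_cost(target, source):
    """Weak transport cost of target given source for the Hamming distance (Lemma 3.3)."""
    return sum(max(s - r, 0.0) ** 2 / s for r, s in zip(target, source) if s > 0)

def tv(nu0, nu1):
    """Total variation norm, normalised as the sum of |nu0 - nu1|."""
    return sum(abs(a - b) for a, b in zip(nu0, nu1))

# The two-point example: masses listed at the points 0 and 1.
nu0, nu1 = [0.75, 0.25], [0.25, 0.75]
total = weak_cost(nu1, nu0) + weak_cost(nu0, nu1)
norm = tv(nu0, nu1)
lower = norm ** 2 / (1 + norm / 2)
print(total, lower, norm ** 2 / 2)

# Random pairs on three points: both inequalities of (4.5) should hold.
random.seed(2)
def rand_prob(m):
    w = [random.random() + 0.05 for _ in range(m)]
    s = sum(w)
    return [x / s for x in w]

pairs = [(rand_prob(3), rand_prob(3)) for _ in range(500)]
ok_first = all(weak_cost(b, a) + weak_cost(a, b) >= tv(a, b) ** 2 / (1 + tv(a, b) / 2) - 1e-12
               for a, b in pairs)
ok_second = all(tv(a, b) ** 2 / (1 + tv(a, b) / 2) >= tv(a, b) ** 2 / 2 - 1e-12
                for a, b in pairs)
```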

4.2. Tensorisation of the displacement convexity property. In this section we prove that if the displacement convexity property of the entropy holds on $n$ graphs $G_1 = (V_1,E_1), \dots, G_n = (V_n,E_n)$, equipped with probability measures $\mu^1,\dots,\mu^n$ and graph distances $d_1,\dots,d_n$ respectively, then the displacement convexity of the entropy holds on their Cartesian product equipped with $\mu^1 \otimes \cdots \otimes \mu^n$ with respect to the tensorised transport costs $I_2^{(n)}$ and $\bar I_2^{(n)}$. As an application we shall apply such a property to the specific example of the hypercube at the end of the section.
The next theorem is one of our main results.

Theorem 4.6. Let $(\mu^1,\dots,\mu^n) \in \mathcal{P}(V_1) \times \cdots \times \mathcal{P}(V_n)$. Assume that for all $i \in \{1,\dots,n\}$ there is a constant $C_i \ge 0$ such that for all $\nu_0, \nu_1 \in \mathcal{P}(V_i)$ there exists $\pi = \pi^i \in \Pi(\nu_0,\nu_1)$ such that for all $t \in [0,1]$ it holds that:

$$H(\nu_t^\pi|\mu^i) \le (1-t)H(\nu_0|\mu^i) + tH(\nu_1|\mu^i) - C_i\,t(1-t)\big(I_2(\pi) + \bar I_2(\pi)\big).$$

Then the product probability measure $\mu = \mu^1 \otimes \cdots \otimes \mu^n$ defined on $G = (V,E) = G_1 \square \cdots \square G_n$ verifies the following property: for all $\nu_0, \nu_1 \in \mathcal{P}(V)$ there exists $\pi = \pi^{(n)} \in \Pi(\nu_0,\nu_1)$ such that for all $t \in [0,1]$ it holds that:

$$H(\nu_t^\pi|\mu) \le (1-t)H(\nu_0|\mu) + tH(\nu_1|\mu) - C\,t(1-t)\big(I_2^{(n)}(\pi) + \bar I_2^{(n)}(\pi)\big),$$

where $C = \min_i C_i$. The same proposition holds replacing $I_2(\pi) + \bar I_2(\pi)$ by $J_2(\pi)$ and $I_2^{(n)}(\pi) + \bar I_2^{(n)}(\pi)$ by $J_2^{(n)}(\pi)$.

Proof. In this proof, we use the notation and definitions introduced in Section 3.2. Fix $\nu_0, \nu_1 \in \mathcal{P}(V)$ and write the following disintegration formulas

$$\nu_0(x) = \nu_0^n(x_n)\prod_{k=1}^{n-1}\nu_0^k(x_k|x_{k+1:n}), \qquad \forall x = (x_1,\dots,x_n) \in V,$$

$$\nu_1(y) = \nu_1^n(y_n)\prod_{k=1}^{n-1}\nu_1^k(y_k|y_{k+1:n}), \qquad \forall y = (y_1,\dots,y_n) \in V,$$

where we recall that $x_{k+1:n} = (x_{k+1},\dots,x_n) \in V_{k+1} \times \cdots \times V_n$.

By assumption, for every $x,y \in V$, there are couplings $\pi^n \in \mathcal{P}(V_n \times V_n)$ and $\pi^k(\,\cdot\,|x_{k+1:n},y_{k+1:n}) \in \mathcal{P}(V_k \times V_k)$ such that

$$\pi^n \in \Pi(\nu_0^n,\nu_1^n) \qquad \text{and} \qquad \pi^k(\,\cdot\,|x_{k+1:n},y_{k+1:n}) \in \Pi\big(\nu_0^k(\,\cdot\,|x_{k+1:n}),\,\nu_1^k(\,\cdot\,|y_{k+1:n})\big),$$

and for which the following inequalities hold

$$H(\nu_t^n|\mu^n) \le (1-t)H(\nu_0^n|\mu^n) + tH(\nu_1^n|\mu^n) - C_n\,t(1-t)\,R_2(\pi^n),$$

$$H\big(\nu_t^{k,x_{k+1:n},y_{k+1:n}}\big|\mu^k\big) \le (1-t)H\big(\nu_0^k(\,\cdot\,|x_{k+1:n})\big|\mu^k\big) + tH\big(\nu_1^k(\,\cdot\,|y_{k+1:n})\big|\mu^k\big) - C_k\,t(1-t)\,R_2\big(\pi^k(\,\cdot\,|x_{k+1:n},y_{k+1:n})\big),$$

where $R_2 := I_2 + \bar I_2$, $\nu_t^n := \nu_t^{\pi^n}$ and $\nu_t^{k,x_{k+1:n},y_{k+1:n}} := \nu_t^{\pi^k(\,\cdot\,|x_{k+1:n},y_{k+1:n})}$.

Now, consider the Knothe-Rosenblatt coupling $\hat\pi \in \Pi(\nu_0,\nu_1)$ constructed from the couplings $\pi^n$ and $\pi^k(\,\cdot\,|x_{k+1:n},y_{k+1:n})$, $x,y \in V$, and denote by $\gamma_t$ the path $\nu_t^{\hat\pi} \in \mathcal{P}(V)$ connecting $\nu_0$ to $\nu_1$.

Let us consider the disintegration of $\gamma_t$ with respect to its marginals:

$$\gamma_t(z) = \gamma_t^n(z_n)\,\gamma_t^{n-1}(z_{n-1}|z_n)\cdots\gamma_t^1(z_1|z_2,\dots,z_n).$$

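We claim that there exist non-negative coefficients $\alpha_t^k(x_{k+1:n},y_{k+1:n},z_{k+1:n})$ such that

$$\sum_{x_{k+1:n},\,y_{k+1:n}}\alpha_t^k(x_{k+1:n},y_{k+1:n},z_{k+1:n}) = 1$$

and such that for all $k \in \{1,\dots,n-1\}$ it holds

$$\gamma_t^k(\,\cdot\,|z_{k+1:n}) = \sum_{x_{k+1:n},\,y_{k+1:n}}\nu_t^{k,x_{k+1:n},y_{k+1:n}}(\,\cdot\,)\,\alpha_t^k(x_{k+1:n},y_{k+1:n},z_{k+1:n}).$$

Indeed, by definition, it holds

$$\gamma_t(z) = \sum_{x,y\in V}\nu_t^{x,y}(z)\,\hat\pi(x,y).$$

So, using the fact that, according to Lemma 2.10, $\nu_t^{x,y}(z) = \prod_{i=1}^{n}\nu_t^{x_i,y_i}(z_i)$, we see that (with the convention $\pi^n(x_n,y_n|x_{n+1:n},y_{n+1:n}) = \pi^n(x_n,y_n)$)

$$\begin{aligned}
\sum_{u\in V:\ u_{k:n}=z_{k:n}}\gamma_t(u) &= \sum_{x,y\in V}\Big(\sum_{u\in V:\ u_{k:n}=z_{k:n}}\ \prod_{i=1}^{n}\nu_t^{x_i,y_i}(u_i)\Big)\hat\pi(x,y) = \sum_{x,y\in V}\ \prod_{i=k}^{n}\nu_t^{x_i,y_i}(z_i)\,\hat\pi(x,y)\\
&= \sum_{x_{k:n},\,y_{k:n}}\ \prod_{i=k}^{n}\nu_t^{x_i,y_i}(z_i)\,\pi^i(x_i,y_i|x_{i+1:n},y_{i+1:n})\\
&= \sum_{x_{k+1:n},\,y_{k+1:n}}\nu_t^{k,x_{k+1:n},y_{k+1:n}}(z_k)\prod_{i=k+1}^{n}\nu_t^{x_i,y_i}(z_i)\,\pi^i(x_i,y_i|x_{i+1:n},y_{i+1:n}).
\end{aligned}$$

From this it follows that

$$\begin{aligned}
\gamma_t^k(z_k|z_{k+1:n}) &= \frac{\sum_{u\in V:\ u_{k:n}=z_{k:n}}\gamma_t(u)}{\sum_{u\in V:\ u_{k+1:n}=z_{k+1:n}}\gamma_t(u)} = \frac{\sum_{x_{k+1:n},\,y_{k+1:n}}\nu_t^{k,x_{k+1:n},y_{k+1:n}}(z_k)\prod_{i=k+1}^{n}\nu_t^{x_i,y_i}(z_i)\,\pi^i(x_i,y_i|x_{i+1:n},y_{i+1:n})}{\sum_{x_{k+1:n},\,y_{k+1:n}}\prod_{i=k+1}^{n}\nu_t^{x_i,y_i}(z_i)\,\pi^i(x_i,y_i|x_{i+1:n},y_{i+1:n})}\\
&=: \sum_{x_{k+1:n},\,y_{k+1:n}}\nu_t^{k,x_{k+1:n},y_{k+1:n}}(z_k)\,\alpha_t^k(x_{k+1:n},y_{k+1:n},z_{k+1:n}),
\end{aligned}$$

using obvious notation, from which the claim follows. Similarly, for all $z_n \in V_n$, it holds $\gamma_t^n(z_n) = \nu_t^n(z_n)$.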

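Now, let us recall the well known disintegration formula for the relative entropy: if $\gamma \in \mathcal{P}(V)$ is absolutely continuous with respect to $\mu$, then it holds

(4.7) $$H(\gamma|\mu) = H(\gamma^n|\mu^n) + \sum_{k=1}^{n-1}\sum_{z\in V} H\big(\gamma^k(\,\cdot\,|z_{k+1:n})\big|\mu^k\big)\,\gamma(z).$$

Applying (4.7) to $\gamma_t$, and the (classical) convexity of the relative entropy, it holds

$$\begin{aligned}
H(\gamma_t|\mu) &= H(\gamma_t^n|\mu^n) + \sum_{k=1}^{n-1}\sum_{z\in V} H\big(\gamma_t^k(\,\cdot\,|z_{k+1:n})\big|\mu^k\big)\,\gamma_t(z)\\
&\le H(\nu_t^n|\mu^n) + \sum_{k=1}^{n-1}\sum_{z\in V}\ \sum_{x_{k+1:n},\,y_{k+1:n}}\alpha_t^k(x_{k+1:n},y_{k+1:n},z_{k+1:n})\,H\big(\nu_t^{k,x_{k+1:n},y_{k+1:n}}\big|\mu^k\big)\,\gamma_t(z).
\end{aligned}$$

Now we deal with each term in the sum separately. Fix $k \in \{1,\dots,n-1\}$. We have

$$\begin{aligned}
&\sum_{z\in V}\ \sum_{x_{k+1:n},\,y_{k+1:n}}\alpha_t^k(x_{k+1:n},y_{k+1:n},z_{k+1:n})\,H\big(\nu_t^{k,x_{k+1:n},y_{k+1:n}}\big|\mu^k\big)\,\gamma_t(z)\\
&\quad= \sum_{z_{k+1:n}}\ \sum_{x_{k+1:n},\,y_{k+1:n}}\alpha_t^k(x_{k+1:n},y_{k+1:n},z_{k+1:n})\,H\big(\nu_t^{k,x_{k+1:n},y_{k+1:n}}\big|\mu^k\big)\sum_{u\in V:\ u_{k+1:n}=z_{k+1:n}}\gamma_t(u)\\
&\quad= \sum_{z_{k+1:n}}\ \sum_{x_{k+1:n},\,y_{k+1:n}} H\big(\nu_t^{k,x_{k+1:n},y_{k+1:n}}\big|\mu^k\big)\prod_{i=k+1}^{n}\nu_t^{x_i,y_i}(z_i)\,\pi^i(x_i,y_i|x_{i+1:n},y_{i+1:n})\\
&\quad= \sum_{x_{k+1:n},\,y_{k+1:n}} H\big(\nu_t^{k,x_{k+1:n},y_{k+1:n}}\big|\mu^k\big)\prod_{i=k+1}^{n}\pi^i(x_i,y_i|x_{i+1:n},y_{i+1:n})\\
&\quad= \sum_{x,y} H\big(\nu_t^{k,x_{k+1:n},y_{k+1:n}}\big|\mu^k\big)\prod_{i=1}^{n}\pi^i(x_i,y_i|x_{i+1:n},y_{i+1:n}).
\end{aligned}$$

Therefore,

$$H(\gamma_t|\mu) \le H(\nu_t^n|\mu^n) + \sum_{k=1}^{n-1}\sum_{x,y} H\big(\nu_t^{k,x_{k+1:n},y_{k+1:n}}\big|\mu^k\big)\,\hat\pi(x,y).$$

Now, applying the assumed displacement convexity inequalities, we get

$$\begin{aligned}
H(\gamma_t|\mu) &\le (1-t)\Big[H(\nu_0^n|\mu^n) + \sum_{k=1}^{n-1}\sum_{x,y} H\big(\nu_0^k(\,\cdot\,|x_{k+1:n})\big|\mu^k\big)\,\hat\pi(x,y)\Big] + t\Big[H(\nu_1^n|\mu^n) + \sum_{k=1}^{n-1}\sum_{x,y} H\big(\nu_1^k(\,\cdot\,|y_{k+1:n})\big|\mu^k\big)\,\hat\pi(x,y)\Big]\\
&\qquad - C\,t(1-t)\Big[R_2(\pi^n) + \sum_{k=1}^{n-1}\sum_{x,y} R_2\big(\pi^k(\,\cdot\,|x_{k+1:n},y_{k+1:n})\big)\,\hat\pi(x,y)\Big]\\
&= (1-t)\Big[H(\nu_0^n|\mu^n) + \sum_{k=1}^{n-1}\sum_{x} H\big(\nu_0^k(\,\cdot\,|x_{k+1:n})\big|\mu^k\big)\,\nu_0(x)\Big] + t\Big[H(\nu_1^n|\mu^n) + \sum_{k=1}^{n-1}\sum_{y} H\big(\nu_1^k(\,\cdot\,|y_{k+1:n})\big|\mu^k\big)\,\nu_1(y)\Big]\\
&\qquad - C\,t(1-t)\Big[R_2(\pi^n) + \sum_{k=1}^{n-1}\sum_{x,y} R_2\big(\pi^k(\,\cdot\,|x_{k+1:n},y_{k+1:n})\big)\,\hat\pi(x,y)\Big]\\
&\le (1-t)H(\nu_0|\mu) + tH(\nu_1|\mu) - C\,t(1-t)\big(I_2^{(n)}(\hat\pi) + \bar I_2^{(n)}(\hat\pi)\big),
\end{aligned}$$

where the last inequality follows from the disintegration equality (4.7) for the relative entropy and from the disintegration inequality given in Proposition 3.8. □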



As an application of Theorem 4.6 we derive the displacement convexity of entropy property on the hypercube.

Corollary 4.8 (Displacement convexity on the hypercube). Let $\mu$ be a probability measure on $\{0,1\}$ and define its $n$-fold product $\mu^{\otimes n}$ on $\Omega_n = \{0,1\}^n$. For any $\nu_0, \nu_1 \in \mathcal{P}(\Omega_n)$, there exists a $\pi \in \Pi(\nu_0,\nu_1)$ such that for any $t \in [0,1]$,

(4.9) $$H(\nu_t^\pi|\mu^{\otimes n}) \le (1-t)H(\nu_0|\mu^{\otimes n}) + tH(\nu_1|\mu^{\otimes n}) - \frac{t(1-t)}{2}\big(I_2^{(n)}(\pi) + \bar I_2^{(n)}(\pi)\big),$$

and there exists $\pi \in \Pi(\nu_0,\nu_1)$ such that for any $t \in [0,1]$,

(4.10) $$H(\nu_t^\pi|\mu^{\otimes n}) \le (1-t)H(\nu_0|\mu^{\otimes n}) + tH(\nu_1|\mu^{\otimes n}) - 2t(1-t)\,J_2^{(n)}(\pi).$$

Proof. According to Proposition 4.1, for all $\nu_0, \nu_1 \in \mathcal{P}(\{0,1\})$, it holds

$$H(\nu_t|\mu) \le (1-t)H(\nu_0|\mu) + tH(\nu_1|\mu) - \frac{t(1-t)}{2}\big(\widetilde{\mathcal{T}}_2(\nu_1|\nu_0) + \widetilde{\mathcal{T}}_2(\nu_0|\nu_1)\big), \qquad \forall t \in [0,1],$$

with $\nu_t = (1-t)\nu_0 + t\nu_1$. It is not difficult to check that the coupling $\pi$ defined by (3.5) is optimal for both $\widetilde{\mathcal{T}}_2(\nu_1|\nu_0)$ and $\widetilde{\mathcal{T}}_2(\nu_0|\nu_1)$. Since on the two-point space $\nu_t = \nu_t^\pi$ is independent of $\pi$, the preceding inequality can be rewritten as follows:

$$H(\nu_t^\pi|\mu) \le (1-t)H(\nu_0|\mu) + tH(\nu_1|\mu) - \frac{t(1-t)}{2}\big(I_2(\pi) + \bar I_2(\pi)\big), \qquad \forall t \in [0,1].$$

Therefore, we are in a position to apply Theorem 4.6 and to conclude that $\mu^{\otimes n}$ verifies the announced displacement convexity property (4.9).
Similarly, by Lemma 3.3, the displacement convexity property (4.3) ensures that

$$H(\nu_t^\pi|\mu) \le (1-t)H(\nu_0|\mu) + tH(\nu_1|\mu) - 2t(1-t)\,J_2(\pi), \qquad \forall t \in [0,1].$$

The result then follows from Theorem 4.6. □



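Let $\pi$ be a coupling of $\nu_0, \nu_1 \in \mathcal{P}(\Omega_n)$. By the Cauchy-Schwarz inequality, we have

$$J_2^{(n)}(\pi) = \sum_{i=1}^{n}\Big(\sum_{x,y\in\Omega_n}\mathbf{1}_{x_i\ne y_i}\,\pi(x,y)\Big)^2 \ge \frac{1}{n}\Big(\sum_{x,y\in\Omega_n}\sum_{i=1}^{n}\mathbf{1}_{x_i\ne y_i}\,\pi(x,y)\Big)^2 = \frac{1}{n}\Big(\sum_{x,y\in\Omega_n} d(x,y)\,\pi(x,y)\Big)^2 \ge \frac{1}{n}\,W_1(\nu_1,\nu_0)^2.$$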
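The Cauchy-Schwarz step comparing $J_2^{(n)}(\pi)$ with the squared mean Hamming distance under $\pi$ can be illustrated on a small cube. This is a minimal sketch with a hypothetical random joint law `pi` on $\Omega_3 \times \Omega_3$ (viewed as a coupling of its own marginals); it does not compute $W_1$, only the intermediate bound.

```python
import random
from itertools import product

random.seed(3)
n = 3
V = list(product((0, 1), repeat=n))

# A random probability on V x V, viewed as a coupling of its own two marginals.
w = {(x, y): random.random() for x in V for y in V}
s = sum(w.values())
pi = {k: v / s for k, v in w.items()}

# J_2^{(n)}(pi): sum over coordinates of the squared probability that they differ.
J2n = sum(sum(p for (x, y), p in pi.items() if x[i] != y[i]) ** 2 for i in range(n))

# Mean Hamming distance under pi; Cauchy-Schwarz gives J2n >= mean_dist^2 / n.
mean_dist = sum(p * sum(a != b for a, b in zip(x, y)) for (x, y), p in pi.items())
print(J2n, mean_dist ** 2 / n)
```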



We immediately deduce from Corollary 4.8 the following weaker result.

Corollary 4.11. Let $\mu$ be a probability measure on $\{0,1\}$ and define its $n$-fold product $\mu^{\otimes n}$ on $\Omega_n = \{0,1\}^n$. For any $\nu_0, \nu_1 \in \mathcal{P}(\Omega_n)$, there exists $\pi \in \Pi(\nu_0,\nu_1)$ such that for $t \in [0,1]$,

$$H(\nu_t^\pi|\mu^{\otimes n}) \le (1-t)H(\nu_0|\mu^{\otimes n}) + tH(\nu_1|\mu^{\otimes n}) - \frac{2t(1-t)}{n}\,W_1(\nu_1,\nu_0)^2.$$

The constant 1 /n encodes, in some sense, the discrete Ricci curvature of the hypercube in accor- 
dance with the various definitions of the discrete Ricci curvature (see the introduction). 

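Remark 4.12. Since $\widetilde{\mathcal{T}}_2^{(n)}$ is defined as an infimum, one can replace, for free, the term $I_2^{(n)}(\pi)$ by $\widetilde{\mathcal{T}}_2^{(n)}(\nu_1|\nu_0)$ in (4.9), and likewise $\bar I_2^{(n)}(\pi)$ by $\widetilde{\mathcal{T}}_2^{(n)}(\nu_0|\nu_1)$. Moreover, if one chooses $\nu_0 = \mu^{\otimes n}$ and uses that $H(\nu_t^\pi|\mu^{\otimes n}) \ge 0$, one easily derives from (4.9) (dividing by $t$ and letting $t \to 0$) the following transport-entropy inequality:

$$\frac{1}{2}\Big(\widetilde{\mathcal{T}}_2^{(n)}(\nu_1|\mu^{\otimes n}) + \widetilde{\mathcal{T}}_2^{(n)}(\mu^{\otimes n}|\nu_1)\Big) \le H(\nu_1|\mu^{\otimes n}), \qquad \forall \nu_1 \in \mathcal{P}(\Omega_n).$$

See [15] for more on such an inequality (on graphs). Note that the above argument is general and that one can always derive from the displacement convexity of the entropy some Talagrand-type transport-entropy inequality.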



5. HWI TYPE INEQUALITIES ON GRAPHS. 

As already stated in the introduction, the displacement convexity of entropy property is usually 
(i.e., in continuous space settings) the strongest property in the following hierarchy: 

Displacement convexity => HWI => Log Sobolev. 

Applying an argument based on the differentiation property of Corollary 2.8, in this section we derive
HWI and log-Sobolev type inequalities from the displacement convexity property.

We shall start with a general statement on products of graphs that allows one to obtain a symmetric HWI
inequality from the displacement convexity property of the entropy. As a consequence, we get a
new symmetric HWI inequality on the hypercube that implies a modified log-Sobolev inequality on
the hypercube. This modified log-Sobolev inequality also implies, by means of the Central Limit
Theorem, the classical log-Sobolev inequality for the standard Gaussian measure, with the optimal
constant.

Then we move to another HWI type inequality involving the already mentioned Dirichlet form
$\mathcal{E}_\mu(f, \log f)$, based on Equation (2.16), available on the complete graph.






5.1. Symmetric HWI inequality for products of graphs. The main result of this section is the 
following abstract symmetric HWI inequality valid on the n-fold product of any graph. 

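Proposition 5.1 (HWI). Consider $G^n$ for $G = (V,E)$ any graph and $\mu \in \mathcal{P}(V^n)$. Assume that $\mu$ verifies
the following displacement convexity inequality: there is some $c > 0$ such that for any $\nu_0, \nu_1 \in \mathcal{P}(V^n)$,
there exists a coupling $\pi \in \Pi(\nu_0,\nu_1)$ such that

$$H(\nu_t^\pi|\mu) \le (1-t)H(\nu_0|\mu) + tH(\nu_1|\mu) - ct(1-t)\left(I_2^{(n)}(\pi) + \bar{I}_2^{(n)}(\pi)\right), \qquad \forall t \in [0,1].$$

Then $\mu$ verifies

$$H(\nu_0|\mu) \le H(\nu_1|\mu) + \sqrt{\sum_{x\in V^n}\sum_{i=1}^n\Big[\sum_{z\in N_i(x)}\Big(\log\frac{\nu_0(x)}{\mu(x)} - \log\frac{\nu_0(z)}{\mu(z)}\Big)_+\Big]^2\nu_0(x)}\;\sqrt{I_2^{(n)}(\pi)} - c\left(I_2^{(n)}(\pi) + \bar{I}_2^{(n)}(\pi)\right) \tag{5.2}$$

for the same $\pi \in \Pi(\nu_0,\nu_1)$ as above, where $N_i(x) = \{z \in V^n;\ d(x,z) = 1 \text{ and } x_i \ne z_i\}$.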

The proof of this result is given below. Before proving that, we derive a certain reinforced log- 
Sobolev inequality (see below for a brief justification of the name) in the discrete setting, and as a 
consequence, the classical Gross' log-Sobolev inequality on the continuous line, with the optimal 
constant. 

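Choose $\nu_1 = \mu$ in (5.2) and denote by $f(x) = \nu_0(x)/\mu(x)$. Then, using the elementary inequality
$\sqrt{ab} \le a/(2\varepsilon) + \varepsilon b/2$, $\varepsilon > 0$, we immediately get the following corollary.

Corollary 5.3 (Reinforced log-Sobolev). Under the same assumptions as in Proposition 5.1, for all
$f: V^n \to (0,\infty)$ with $\mu(f) = 1$, for all $\varepsilon \le 2c$, it holds that

$$\mathrm{Ent}_\mu(f) \le \frac{1}{2\varepsilon}\sum_{x\in V^n}\sum_{i=1}^n\Big[\sum_{z\in N_i(x)}\big(\log f(x) - \log f(z)\big)_+\Big]^2 f(x)\mu(x) - \Big(c - \frac{\varepsilon}{2}\Big)\widetilde{T}_2(\mu|f\mu) - c\,\widetilde{T}_2(f\mu|\mu). \tag{5.4}$$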



Inequality (5.4) can be seen as a reinforcement of a (discrete) modified log-Sobolev inequality.
The next corollary deals with the special case of the discrete cube.

Corollary 5.5 (Reinforced log-Sobolev on $\Omega_n$ and Gross' Inequality). Let $\mu$ be a Bernoulli measure
on $\{0,1\}$. Then, for any $n$ and any $f: \Omega_n \to (0,\infty)$, it holds

$$\mathrm{Ent}_{\mu^{\otimes n}}(f) \le \frac{1}{2}\sum_{x\in\Omega_n}\sum_{i=1}^n\big[\log f(x) - \log f(\sigma_i(x))\big]_+^2 f(x)\mu^{\otimes n}(x) - \frac{1}{2}\widetilde{T}_2(f\mu^{\otimes n}|\mu^{\otimes n}), \tag{5.6}$$

where $\sigma_i(x) = (x_1, \dots, x_{i-1}, 1 - x_i, x_{i+1}, \dots, x_n)$ is the neighbor of $x = (x_1, \dots, x_n)$ for which the $i$-th
coordinate differs from that of $x$.

As a consequence, for any $n$ and any $g: \mathbb{R}^n \to \mathbb{R}$ smooth enough, it holds

$$\mathrm{Ent}_{\gamma_n}(e^g) \le \frac{1}{2}\int |\nabla g|^2 e^g\, d\gamma_n \tag{5.7}$$

where $\gamma_n$ is the standard Gaussian measure on $\mathbb{R}^n$ and $|\nabla g|$ is the length of the gradient of $g$.

Remark 5.8. Note that the constant 1/2 in the above log-Sobolev inequality for the standard Gauss- 
ian is optimal, see e.g. |U Chapter 1]. 

We proceed with the proofs of Proposition 5.1 and Corollary 5.5.

Proof of Proposition 5.1. The displacement convexity inequality ensures that for all $t \in (0,1]$,
$$H(\nu_0|\mu) \le H(\nu_1|\mu) - \frac{H(\nu_t^\pi|\mu) - H(\nu_0|\mu)}{t} - c(1-t)\left(I_2^{(n)}(\pi) + \bar{I}_2^{(n)}(\pi)\right).$$



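As $t$ goes to 0, this yields

$$H(\nu_0|\mu) \le H(\nu_1|\mu) - \frac{d}{dt}H(\nu_t^\pi|\mu)\Big|_{t=0} - c\left(I_2^{(n)}(\pi) + \bar{I}_2^{(n)}(\pi)\right),$$

where $\pi \in \Pi(\nu_0,\nu_1)$. According to Corollary 2.8, it holds

$$\frac{d}{dt}H(\nu_t^\pi|\mu)\Big|_{t=0} \ge \sum_{x,y\in V^n}\sum_{z\sim x}\Big(\log\frac{\nu_0(z)}{\mu(z)} - \log\frac{\nu_0(x)}{\mu(x)}\Big)\, d(x,y)\,\frac{|\Gamma(x,z,y)|}{|\Gamma(x,y)|}\,\pi(x,y)$$
$$\ge -\sum_{x\in V^n}\sum_{i=1}^n\sum_{z\in N_i(x)}\Big(\log\frac{\nu_0(x)}{\mu(x)} - \log\frac{\nu_0(z)}{\mu(z)}\Big)_+\sum_{y\in V^n} d(x,y)\,\frac{|\Gamma(x,z,y)|}{|\Gamma(x,y)|}\,\pi(x,y).$$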



According to (2.11), by induction on $n \ge 1$, we get that for all $u, y \in V^n$,

$$|\Gamma(u,y)| = \frac{d(u,y)!}{\prod_{i=1}^n d(u_i,y_i)!}\,\prod_{i=1}^n |\Gamma(u_i,y_i)|.$$
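The product formula for the number of geodesics can be checked by brute force on a small example. The Python sketch below counts shortest paths by breadth-first search on $G^n$ and compares the result with the multinomial expression. The base graph (the cycle $C_4$), the constant `M` and all function names are hypothetical choices made for illustration only.

```python
import itertools
from collections import deque
from math import factorial, prod

M = 4  # hypothetical base graph G: the cycle C_4 on vertices 0..3


def base_neighbors(v):
    return [(v - 1) % M, (v + 1) % M]


def product_neighbors(x):
    # Neighbors in G^n: change exactly one coordinate along an edge of G.
    for i, xi in enumerate(x):
        for w in base_neighbors(xi):
            yield x[:i] + (w,) + x[i + 1:]


def geodesic_counts(neighbors, source):
    """BFS from source: graph distance and number of geodesics to each vertex."""
    dist, count = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors(u):
            if v not in dist:
                dist[v], count[v] = dist[u] + 1, 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                count[v] += count[u]
    return dist, count


def check_product_formula(n):
    base = {s: geodesic_counts(base_neighbors, s) for s in range(M)}
    for u in itertools.product(range(M), repeat=n):
        dist, count = geodesic_counts(product_neighbors, u)
        for y in itertools.product(range(M), repeat=n):
            d_i = [base[ui][0][yi] for ui, yi in zip(u, y)]
            c_i = [base[ui][1][yi] for ui, yi in zip(u, y)]
            multinomial = factorial(sum(d_i)) // prod(factorial(k) for k in d_i)
            if dist[y] != sum(d_i) or count[y] != multinomial * prod(c_i):
                return False
    return True


print(check_product_formula(2))
```

The multinomial factor counts the ways of interleaving the coordinate-wise geodesics, which is the combinatorial content of the induction.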

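Applying this formula with $u = z \in N_i(x)$ for some $i \in \{1, \dots, n\}$ and with $u = x$, we get that for all $y$ such
that $z \in [\![x,y]\!]$, it holds

$$\frac{|\Gamma(x,z,y)|}{|\Gamma(x,y)|} = \frac{|\Gamma(z,y)|}{|\Gamma(x,y)|} = \frac{d(z,y)!}{d(x,y)!}\,\frac{d(x_i,y_i)!}{d(z_i,y_i)!}\,\frac{|\Gamma(z_i,y_i)|}{|\Gamma(x_i,y_i)|} = \frac{d(x_i,y_i)}{d(x,y)}\,\frac{|\Gamma(z_i,y_i)|}{|\Gamma(x_i,y_i)|}, \tag{5.9}$$

using that $x_j = z_j$ for all $j \ne i$ and the relations $d(x,y) = 1 + d(z,y)$ and $d(x_i,y_i) = 1 + d(z_i,y_i)$.
Therefore, when $z \in N_i(x)$,

$$d(x,y)\,\frac{|\Gamma(x,z,y)|}{|\Gamma(x,y)|} \le d(x_i,y_i).$$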



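Plugging this inequality into the expression for $\frac{d}{dt}H(\nu_t^\pi|\mu)\big|_{t=0}$ yields:

$$-\frac{d}{dt}H(\nu_t^\pi|\mu)\Big|_{t=0} \le \sum_{x\in V^n}\sum_{i=1}^n\Big[\sum_{z\in N_i(x)}\Big(\log\frac{\nu_0(x)}{\mu(x)} - \log\frac{\nu_0(z)}{\mu(z)}\Big)_+\Big]\Big(\sum_{y\in V^n} d(x_i,y_i)\,\frac{\pi(x,y)}{\nu_0(x)}\Big)\nu_0(x)$$
$$\le \sqrt{\sum_{x\in V^n}\sum_{i=1}^n\Big[\sum_{z\in N_i(x)}\Big(\log\frac{\nu_0(x)}{\mu(x)} - \log\frac{\nu_0(z)}{\mu(z)}\Big)_+\Big]^2\nu_0(x)}\;\sqrt{I_2^{(n)}(\pi)},$$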

where the last line follows from the Cauchy-Schwarz inequality. This completes the proof.

Proof of Corollary 5.5. By Corollary 4.8, the displacement convexity assumption of Proposition 5.1 holds with $c = 1/2$. Observe that $N_i(x) =
\{\sigma_i(x)\}$ where $\sigma_i(x) = (x_1, \dots, x_{i-1}, 1 - x_i, x_{i+1}, \dots, x_n)$ is the neighbor of $x = (x_1, \dots, x_n)$ for which
the $i$-th coordinate differs from that of $x$. For $\varepsilon = 1$, Corollary 5.3 gives

$$\mathrm{Ent}_{\mu^{\otimes n}}(f) \le \frac{1}{2}\sum_{x\in\Omega_n}\sum_{i=1}^n\big[\log f(x) - \log f(\sigma_i(x))\big]_+^2 f(x)\mu^{\otimes n}(x) - \frac{1}{2}\widetilde{T}_2(f\mu^{\otimes n}|\mu^{\otimes n}),$$

which is the first part of the corollary.

For the second part, we shall apply the Central Limit Theorem. Our starting point is the following
modified log-Sobolev inequality on the hypercube:

$$\mathrm{Ent}_{\mu^{\otimes n}}(f) \le \frac{1}{2}\sum_{x\in\Omega_n}\sum_{i=1}^n\big[\log f(x) - \log f(\sigma_i(x))\big]_+^2 f(x)\mu^{\otimes n}(x) \tag{5.10}$$

that holds for all product probability measures on the hypercube $\Omega_n = \{0,1\}^n$, for all dimensions
$n \ge 1$.
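As a sanity check, the modified log-Sobolev inequality (5.10) can be tested numerically on a small hypercube. The Python sketch below computes both sides for a random positive function $f$ under a product Bernoulli measure. The function name `entropy_and_bound`, the range of $\log f$ and the seeds are assumptions made for illustration; the check supports the inequality on examples and is of course not a proof.

```python
import itertools
import math
import random


def entropy_and_bound(n, p, seed=0):
    """Return (Ent(f), right-hand side of (5.10)) for a random f on {0,1}^n."""
    rng = random.Random(seed)
    cube = list(itertools.product((0, 1), repeat=n))
    # Product Bernoulli(p) measure: weight p for a coordinate equal to 1.
    mu = {x: math.prod(p if xi == 1 else 1 - p for xi in x) for x in cube}
    # Random positive function; log f is uniform on [-2, 2] (illustrative choice).
    f = {x: math.exp(rng.uniform(-2.0, 2.0)) for x in cube}
    mean = sum(f[x] * mu[x] for x in cube)
    ent = sum(f[x] * math.log(f[x]) * mu[x] for x in cube) - mean * math.log(mean)
    bound = 0.0
    for x in cube:
        for i in range(n):
            flipped = x[:i] + (1 - x[i],) + x[i + 1:]  # sigma_i(x)
            gap = max(math.log(f[x]) - math.log(f[flipped]), 0.0)
            bound += 0.5 * gap ** 2 * f[x] * mu[x]
    return ent, bound


ent, bound = entropy_and_bound(3, 0.3)
print(ent <= bound)
```

Both sides are homogeneous of degree one in $f$, so the sketch does not normalise $f$ to have mean one.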

First we observe that, by tensorisation of the log-Sobolev inequality (see e.g. CD Chapter 1]), we
only need to prove Gross' Inequality (5.7) in dimension one ($n = 1$). Then, thanks to a result by
Miclo [35], we know that extremal functions in the log-Sobolev inequality, in dimension one, are
monotone. Hence, we can assume that $g$ is monotone and non-decreasing (the case $g$ non-increasing
can be treated similarly). Furthermore, for convenience, we first assume that the function $g: \mathbb{R} \to \mathbb{R}$
is smooth and compactly supported.

Let $\mu_p$ be the Bernoulli probability measure with parameter $p \in [0,1]$. We apply (5.10) to the
function $f = e^{G_n}$, with

$$G_n(x) = g\Big(\frac{\sum_{i=1}^n x_i - np}{\sqrt{np(1-p)}}\Big), \qquad x \in \Omega_n,$$

so that $\mathrm{Ent}_{\mu_p^{\otimes n}}(e^{G_n})$ tends to $\mathrm{Ent}_\gamma(e^g)$ by the Central Limit Theorem. It remains to identify the limit,
when $n$ tends to infinity, of the Dirichlet form (the first term in the right-hand side of (5.10)). Let $x^i y_i$
denote the vector $(x_1, \dots, x_{i-1}, y_i, x_{i+1}, \dots, x_n)$. Then,

$$\sum_{x_i\in\{0,1\}}\big[G_n(x) - G_n(\sigma_i(x))\big]_+^2 e^{G_n(x)}\mu_p(x_i) = p\big[G_n(x^i 1) - G_n(x^i 0)\big]_+^2 e^{G_n(x^i 1)} + (1-p)\big[G_n(x^i 0) - G_n(x^i 1)\big]_+^2 e^{G_n(x^i 0)}.$$

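Now, since

$$\frac{\sum_{j=1}^n x_j - np}{\sqrt{np(1-p)}} - \frac{\sum_{j\ne i} x_j - (n-1)p}{\sqrt{(n-1)p(1-p)}} = \frac{x_i}{\sqrt{np(1-p)}} + \frac{1}{\sqrt{p(1-p)}}\sum_{j\ne i}x_j\Big(\frac{1}{\sqrt{n}} - \frac{1}{\sqrt{n-1}}\Big) - \frac{p}{\sqrt{p(1-p)}}\big(\sqrt{n} - \sqrt{n-1}\big)$$
$$= \frac{x_i}{\sqrt{np(1-p)}} - \frac{\sum_{j\ne i}x_j}{\sqrt{p(1-p)}\,(\sqrt{n}+\sqrt{n-1})\sqrt{n}\sqrt{n-1}} - \frac{p}{\sqrt{p(1-p)}\,(\sqrt{n}+\sqrt{n-1})} = O\Big(\frac{1}{\sqrt{n}}\Big),$$

by a Taylor expansion, we have

$$G_n(x^i 1) - G_n(x^i 0) = \frac{1}{\sqrt{np(1-p)}}\; g'\Big(\frac{\sum_{j\ne i}x_j - (n-1)p}{\sqrt{(n-1)p(1-p)}}\Big) + O\Big(\frac{1}{n}\Big).$$

Setting $y_i(x) = \dfrac{\sum_{j\ne i}x_j - (n-1)p}{\sqrt{(n-1)p(1-p)}}$, and recalling that $g$ is non-decreasing (so that the second bracket above vanishes), it follows that

$$\sum_{x_i\in\{0,1\}}\big[G_n(x) - G_n(\sigma_i(x))\big]_+^2 e^{G_n(x)}\mu_p(x_i) = \frac{g'(y_i(x))^2\, e^{g(y_i(x))}}{n(1-p)} + O\Big(\frac{1}{n^{3/2}}\Big).$$

Now, since all $y_i(x)$'s have the same law under $\mu_p^{\otimes n}$, it follows that

$$\frac{1}{2}\sum_{x\in\Omega_n}\sum_{i=1}^n\big[G_n(x) - G_n(\sigma_i(x))\big]_+^2 e^{G_n(x)}\mu_p^{\otimes n}(x) = \frac{1}{2(1-p)}\sum_{x\in\Omega_n} g'(y_1(x))^2\, e^{g(y_1(x))}\mu_p^{\otimes n}(x) + O\Big(\frac{1}{\sqrt{n}}\Big).$$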
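The Taylor step can be illustrated numerically: the increment of $G_n$ when the $i$-th coordinate is flipped from 0 to 1 should be close to $g'(y_i(x))/\sqrt{np(1-p)}$ for large $n$. The Python sketch below compares the two quantities. The choice $g = \sin$ (neither monotone nor compactly supported, used only to exercise the expansion), the function name and the numerical values are illustrative assumptions.

```python
import math


def increment_vs_taylor(n, p, s_rest, g=math.sin, dg=math.cos):
    """Compare G_n(x^i 1) - G_n(x^i 0) with g'(y_i) / sqrt(n p (1 - p)).

    s_rest stands for the sum of the coordinates x_j with j != i.
    """
    scale = math.sqrt(n * p * (1 - p))

    def G(total):
        # G_n depends on x only through the sum of its coordinates.
        return g((total - n * p) / scale)

    exact = G(s_rest + 1) - G(s_rest)
    y_i = (s_rest - (n - 1) * p) / math.sqrt((n - 1) * p * (1 - p))
    approx = dg(y_i) / scale
    return exact, approx


exact, approx = increment_vs_taylor(10**6, 0.3, 300200)
print(exact, approx)
```

With these values the two numbers agree to a relative error far below one percent, in line with a remainder that is small compared with the leading term of order $1/\sqrt{n}$.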



The desired result follows by the Central Limit Theorem, then optimizing over all $p \in [0,1]$, and
finally by a standard density argument. This ends the proof. □

5.2. Complete graph. Combining the differentiation property (2.16) together with the displacement
convexity on the complete graph of Proposition 4.1, we shall prove the following result.

Proposition 5.11 (HWI type inequality on the complete graph). Let $\mu \equiv 1/n$ be the uniform measure
on the complete graph $K_n$. Then, for any $f: V(K_n) \to (0,\infty)$ with $\int f\,d\mu = 1$, it holds

$$\mathrm{Ent}_\mu(f) \le \mathcal{E}_\mu(f, \log f) - \frac{1}{2}\left(\widetilde{T}_2(\mu|f\mu) + \widetilde{T}_2(f\mu|\mu)\right),$$

where

$$\mathcal{E}_\mu(f, \log f) := \frac{1}{2}\sum_{x,y\in K_n}\big(f(y) - f(x)\big)\big(\log f(y) - \log f(x)\big)\mu(x)\mu(y)$$

corresponds to the Dirichlet form associated to the Markov chain on $K_n$ that jumps uniformly at
random from any vertex to any vertex (i.e. with transition probabilities $K(x,y) = \mu(y) = 1/n$, for any
$x, y \in V(K_n)$).

Proof. We follow the same line of proof as in Proposition 5.1. Fix $f: V(K_n) \to (0,\infty)$ with $\int f\,d\mu = 1$.
By Proposition 4.1 applied to $\nu_1 = \mu$ (which implies that $H(\nu_1|\mu) = 0$) and $\nu_0 = f\mu$, we have

$$H(\nu_t|\mu) \le (1-t)H(\nu_0|\mu) - \frac{t(1-t)}{2}\left(\widetilde{T}_2(\nu_1|\nu_0) + \widetilde{T}_2(\nu_0|\nu_1)\right)$$

where $\nu_t = (1-t)\nu_0 + t\nu_1$. Hence, as $t$ goes to 0, we get

$$\int f\log f\,d\mu = H(\nu_0|\mu) \le -\frac{d}{dt}H(\nu_t|\mu)\Big|_{t=0} - \frac{1}{2}\left(\widetilde{T}_2(\nu_1|\nu_0) + \widetilde{T}_2(\nu_0|\nu_1)\right).$$

The expected result follows from (2.16). □

In the case of the two-point space, one can deal with any Bernoulli measure (not only the uniform 
one as in the case of the complete graph). 






Proposition 5.12 (HWI for the two-point space). Let $\mu$ be a Bernoulli-$p$, $p \in (0,1)$, measure on the
two-point space $\Omega_1 = \{0,1\}$. Then, for any $f: \Omega_1 \to (0,\infty)$ with $\mu(f) = 1$, it holds

$$\mathrm{Ent}_\mu(f) \le \mathcal{E}_\mu(f, \log f) - \frac{1}{2}\left(\widetilde{T}_2(\mu|f\mu) + \widetilde{T}_2(f\mu|\mu)\right)$$

where

$$\mathcal{E}_\mu(f, \log f) = p(1-p)\big(f(1) - f(0)\big)\big(\log f(1) - \log f(0)\big).$$

Proof. Reasoning as above, Proposition 4.1 applied to $\nu_1 = \mu$ and $\nu_0 = f\mu$ implies

$$\mathrm{Ent}_\mu(f) \le -\frac{d}{dt}H(\nu_t|\mu)\Big|_{t=0} - \frac{1}{2}\left(\widetilde{T}_2(\mu|f\mu) + \widetilde{T}_2(f\mu|\mu)\right),$$

where $\nu_t = (1-t)f\mu + t\mu$. Set $q = 1 - p$. Since $H(\nu_t|\mu) = [(1-t)f(0)q + tq]\log[(1-t)f(0) + t] +
[(1-t)f(1)p + tp]\log[(1-t)f(1) + t]$, it immediately follows that

$$\frac{d}{dt}H(\nu_t|\mu)\Big|_{t=0} = q(1 - f(0))\log f(0) + q(1 - f(0)) + p(1 - f(1))\log f(1) + p(1 - f(1))$$
$$= q(1 - f(0))\log f(0) + p(1 - f(1))\log f(1)$$

where the second equality follows from the fact that $p + q = 1 = \mu(f) = qf(0) + pf(1)$. Using again
that $1 = qf(0) + pf(1)$, we observe that

$$q(1 - f(0))\log f(0) = pq\big(f(1) - f(0)\big)\log f(0)$$

and

$$p(1 - f(1))\log f(1) = -pq\big(f(1) - f(0)\big)\log f(1),$$

from which the expected result follows. □
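The derivative computation in this proof is easy to confirm numerically. The Python sketch below compares a one-sided finite difference of $t \mapsto H(\nu_t|\mu)$ at $t = 0$ with $-\mathcal{E}_\mu(f, \log f)$ on the two-point space. The function name `two_point_check`, the step size and the sample values of $p$ and $f(0)$ are illustrative assumptions.

```python
import math


def two_point_check(p, f0, t=1e-6):
    """Return (finite-difference derivative of H(nu_t|mu) at 0, Dirichlet form)."""
    q = 1.0 - p
    f1 = (1.0 - q * f0) / p  # enforces mu(f) = q f(0) + p f(1) = 1
    if f1 <= 0:
        raise ValueError("f(0) is too large for f(1) to stay positive")

    def H(s):
        # nu_s = (1 - s) f mu + s mu has density (1 - s) f + s with respect to mu.
        u0 = (1 - s) * f0 + s
        u1 = (1 - s) * f1 + s
        return q * u0 * math.log(u0) + p * u1 * math.log(u1)

    derivative = (H(t) - H(0.0)) / t
    dirichlet = p * q * (f1 - f0) * (math.log(f1) - math.log(f0))
    return derivative, dirichlet


derivative, dirichlet = two_point_check(0.3, 0.5)
print(derivative, -dirichlet)
```

The two printed values should agree up to the finite-difference error, which is of the order of the step size `t`.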

6. Prekopa-Leindler type inequality 

In this section we show by a duality argument that the displacement convexity property implies a
discrete version of the Prekopa-Leindler inequality. (This argument was originally done by J. Lehec
[25] in the context of Brascamp-Lieb inequalities.) Then we show that this Prekopa-Leindler inequality
allows one to recover the discrete modified log-Sobolev inequality (5.10) and a weak version of the
transport-entropy inequality of Remark 4.12.

Let us first recall the statement of the usual Prekopa-Leindler inequality. 

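Theorem 6.1 (Prekopa-Leindler EU|42j[26l). Let $n \in \mathbb{N}^*$ and $t \in [0,1]$. For all triples $(f,g,h)$ of
measurable functions on $\mathbb{R}^n$ such that

$$h((1-t)x + ty) \ge (1-t)f(x) + tg(y), \qquad \forall x, y \in \mathbb{R}^n,$$

it holds

$$\int e^{h(z)}\,dz \ge \Big(\int e^{f(x)}\,dx\Big)^{1-t}\Big(\int e^{g(y)}\,dy\Big)^{t}.$$

Using the identity (with $\|\cdot\|_2$ denoting the Euclidean norm),

$$\frac{1}{2}\|(1-t)x + ty\|_2^2 = (1-t)\frac{\|x\|_2^2}{2} + t\frac{\|y\|_2^2}{2} - \frac{t(1-t)}{2}\|x - y\|_2^2, \qquad x, y \in \mathbb{R}^n,$$

one can recast, without loss, the preceding result into an inequality for the Gaussian distribution.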
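The quadratic identity behind this change of reference measure can be checked on random vectors. The Python sketch below evaluates the difference between the two sides of $\frac12\|(1-t)x+ty\|^2 = (1-t)\frac{\|x\|^2}{2} + t\frac{\|y\|^2}{2} - \frac{t(1-t)}{2}\|x-y\|^2$. The function name `identity_gap` and the random Gaussian inputs are illustrative assumptions.

```python
import random


def identity_gap(dim, t, seed=0):
    """Return lhs - rhs of the quadratic interpolation identity for random x, y."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    y = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    def sq_norm(v):
        return sum(c * c for c in v)

    lhs = 0.5 * sq_norm([(1 - t) * a + t * b for a, b in zip(x, y)])
    rhs = ((1 - t) * sq_norm(x) / 2 + t * sq_norm(y) / 2
           - t * (1 - t) / 2 * sq_norm([a - b for a, b in zip(x, y)]))
    return lhs - rhs


print(abs(identity_gap(5, 0.3)) < 1e-12)
```

This identity is what turns the extra term $-\frac{t(1-t)}{2}\|x-y\|^2$ in the Gaussian hypothesis into the Lebesgue-measure hypothesis once the Gaussian density is absorbed into $f$, $g$ and $h$.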




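Theorem 6.2 (Prekopa-Leindler: the Gaussian case). Let $\gamma_n$ be the standard normal distribution on
$\mathbb{R}^n$ and $t \in [0,1]$. For all triples $(f, g, h)$ of measurable functions on $\mathbb{R}^n$ such that

$$h((1-t)x + ty) \ge (1-t)f(x) + tg(y) - \frac{t(1-t)}{2}\|x - y\|_2^2, \qquad \forall x, y \in \mathbb{R}^n, \tag{6.3}$$

it holds that

$$\int e^{h(z)}\,\gamma_n(dz) \ge \Big(\int e^{f(x)}\,\gamma_n(dx)\Big)^{1-t}\Big(\int e^{g(y)}\,\gamma_n(dy)\Big)^{t}.$$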

The next result shows that a discrete Prekopa-Leindler inequality can be derived from the displace- 
ment convexity property of the relative entropy. 

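Theorem 6.4 (Prekopa-Leindler (discrete version)). Let $n \in \mathbb{N}^*$, $t \in [0,1]$ and $\mu \in \mathcal{P}(V^n)$. Suppose
that $\mu$ verifies the following property: for any $\nu_0, \nu_1 \in \mathcal{P}(V^n)$, there exists a coupling $\pi \in \Pi(\nu_0,\nu_1)$
such that

$$H(\nu_t^\pi|\mu) \le (1-t)H(\nu_0|\mu) + tH(\nu_1|\mu) - ct(1-t)\,I_2^{(n)}(\pi). \tag{6.5}$$

If $(f, g, h)$ is a triple of functions on $V^n$ such that: $\forall x \in V^n$, $\forall m \in \mathcal{P}(V^n)$,

$$\iint h(z)\,\nu_t^{x,y}(dz)\,m(dy) \ge (1-t)f(x) + t\int g(y)\,m(dy) - ct(1-t)\sum_{i=1}^n\Big(\int d(x_i,y_i)\,m(dy)\Big)^2, \tag{6.6}$$

then it holds

$$\int e^{h(z)}\,\mu(dz) \ge \Big(\int e^{f(x)}\,\mu(dx)\Big)^{1-t}\Big(\int e^{g(y)}\,\mu(dy)\Big)^{t}.$$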

Proof. Let $n \in \mathbb{N}^*$, $f, g, h: V^n \to \mathbb{R}$, $\mu \in \mathcal{P}(V^n)$, $t \in [0,1]$ and $c \in (0,\infty)$ satisfying the hypotheses
of the theorem. Given $\nu_0, \nu_1 \in \mathcal{P}(V^n)$, let $\pi$ be such that (6.5) holds and let $p$ be such that $\pi(x,y) =
\nu_0(x)p(x,y)$, $x, y \in V^n$.

Then, integrate (6.6) in the variable $x$ with respect to $\nu_0$, with $m(y) = p(x,y)$, so that (recall (2.4))

$$\int h\,d\nu_t^\pi \ge (1-t)\int f\,d\nu_0 + t\int g\,d\nu_1 - ct(1-t)\,I_2^{(n)}(\pi).$$

Together with (6.5), we end up with

$$\int h\,d\nu_t^\pi - H(\nu_t^\pi|\mu) \ge (1-t)\Big(\int f\,d\nu_0 - H(\nu_0|\mu)\Big) + t\Big(\int g\,d\nu_1 - H(\nu_1|\mu)\Big).$$

The result follows by optimization, since by duality (for any $\alpha: V^n \to \mathbb{R}$),

$$\sup_{m\in\mathcal{P}(V^n)}\Big\{\int \alpha\,dm - H(m|\mu)\Big\} = \log\int e^{\alpha}\,d\mu.$$

This ends the proof. □ 
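The duality formula used in the last step can be illustrated on a finite set. The Python sketch below evaluates $\int \alpha\,dm - H(m|\mu)$ at the Gibbs measure $m \propto e^{\alpha}\mu$, where the supremum is attained, and at randomly drawn probability vectors. The function name `gibbs_duality`, the number of trials and the sample values of $\alpha$ and $\mu$ are illustrative assumptions.

```python
import math
import random


def gibbs_duality(alpha, mu, trials=2000, seed=0):
    """Return (log int e^alpha dmu, objective at the Gibbs measure, best random value)."""
    z = sum(math.exp(a) * w for a, w in zip(alpha, mu))
    log_mgf = math.log(z)

    def objective(m):
        integral = sum(a * mi for a, mi in zip(alpha, m))
        rel_ent = sum(mi * math.log(mi / w) for mi, w in zip(m, mu) if mi > 0)
        return integral - rel_ent

    # The Gibbs measure e^alpha mu / z is the maximiser of the objective.
    gibbs = [math.exp(a) * w / z for a, w in zip(alpha, mu)]
    rng = random.Random(seed)
    best_random = -math.inf
    for _ in range(trials):
        raw = [rng.random() for _ in mu]
        total = sum(raw)
        best_random = max(best_random, objective([r / total for r in raw]))
    return log_mgf, objective(gibbs), best_random


log_mgf, at_gibbs, best_random = gibbs_duality([0.5, -1.0, 2.0], [0.2, 0.5, 0.3])
print(log_mgf, at_gibbs, best_random)
```

In the proof, this formula is applied three times: to $h$ after taking the supremum over $\nu_t^\pi$ on the left, and to $f$ and $g$ after optimizing over $\nu_0$ and $\nu_1$ on the right.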

An immediate corollary is a Prekopa-Leindler inequality on the discrete hypercube. 

Corollary 6.7. Let $\mu$ be a probability measure on $\{0,1\}$, $n \in \mathbb{N}^*$ and $t \in [0,1]$. For all triples $(f, g, h)$
verifying (6.6) with $c = 1/2$, it holds

$$\int e^{h(z)}\,\mu^{\otimes n}(dz) \ge \Big(\int e^{f(x)}\,\mu^{\otimes n}(dx)\Big)^{1-t}\Big(\int e^{g(y)}\,\mu^{\otimes n}(dy)\Big)^{t}.$$

It is well known that Talagrand's transport-entropy inequality and the logarithmic Sobolev inequality
for the Gaussian measure are both consequences of the Prekopa-Leindler inequality of Theorem 6.2 [4].
Similarly, the discrete version of the Prekopa-Leindler inequality implies the modified logarithmic
Sobolev inequality induced by Corollary 5.3 and the transport-entropy inequality associated
with the distance $\widetilde{T}_2$ of Remark 4.12.

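Corollary 6.8. Assume that the following Prekopa-Leindler inequality holds: for all $t \in (0,1)$, for
all triples of functions $(f, g, h)$ on $V^n$ such that: $\forall x \in V^n$, $\forall m \in \mathcal{P}(V^n)$,

$$\iint h(z)\,\nu_t^{x,y}(dz)\,m(dy) \ge (1-t)f(x) + t\int g(y)\,m(dy) - ct(1-t)\sum_{i=1}^n\Big(\int d(x_i,y_i)\,m(dy)\Big)^2,$$

it holds that

$$\int e^{h(z)}\,\mu(dz) \ge \Big(\int e^{f(x)}\,\mu(dx)\Big)^{1-t}\Big(\int e^{g(y)}\,\mu(dy)\Big)^{t}.$$

Then one has, for all functions $h: V^n \to \mathbb{R}$,

$$\mathrm{Ent}_\mu(e^h) \le \frac{1}{4c}\sum_{x\in V^n}\sum_{i=1}^n\Big[\sum_{z\in N_i(x)}\big(h(x) - h(z)\big)_+\Big]^2 e^{h(x)}\mu(x),$$

and for all probability measures $\nu$, absolutely continuous with respect to $\mu$,

$$c\,\widetilde{T}_2(\mu|\nu) \le H(\nu|\mu), \tag{6.9}$$

$$c\,\widetilde{T}_2(\nu|\mu) \le H(\nu|\mu). \tag{6.10}$$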



Proof. We first prove the transport-entropy inequalities (6.9) and (6.10). Let $k$ be a function on $V^n$
(necessarily bounded, since $V$ is finite). We apply the discrete Prekopa-Leindler inequality with
$h = 0$, $g = -(1-t)k$ and $f = tQk$, with $Qk$ defined so that the condition (6.6) holds: for all $x \in V^n$,

$$Qk(x) = \inf_{m\in\mathcal{P}(V^n)}\Big\{\int k(y)\,m(dy) + c\sum_{i=1}^n\Big(\int d(x_i,y_i)\,m(dy)\Big)^2\Big\}.$$

Therefore, one has for all $t \in (0,1)$,

$$\Big(\int e^{tQk}\,d\mu\Big)^{1/t}\Big(\int e^{-(1-t)k}\,d\mu\Big)^{1/(1-t)} \le 1.$$

As $t$ goes to 1, we get for all functions $k$ on $V^n$,

$$\int e^{Qk}\,d\mu \le e^{\mu(k)},$$
and this is known to be a dual form of the transport-entropy inequality (6.9) (see [15]). Similarly, as $t$
goes to 0, we get for all functions $k$ on $V^n$,

$$\int e^{-k}\,d\mu \le e^{-\mu(Qk)},$$

which is a dual form of the transport-entropy inequality (6.10).

Let us now turn to the proof of the modified discrete logarithmic Sobolev inequality. Fix a bounded
function $h: V^n \to \mathbb{R}$ and choose $g = th$ and $f = h + tR_th$ with $R_th$ designed so that condition (6.6)
holds. Namely, for all $x \in V^n$,

$$R_th(x) = \inf\Big\{\frac{1}{t(1-t)}\Big(\iint h(z)\,\nu_t^{x,y}(dz)\,m(dy) - (1-t)h(x) - t^2\int h(y)\,m(dy)\Big) + c\sum_{i=1}^n\Big(\int d(x_i,y_i)\,m(dy)\Big)^2\Big\},$$

where the infimum runs over all probability measures $m \in \mathcal{P}(V^n)$. Then the Prekopa-Leindler
inequality reads

$$\int e^{h}\,d\mu \ge \Big(\int e^{h}e^{tR_th}\,d\mu\Big)^{1-t}\Big(\int e^{th}\,d\mu\Big)^{t},$$

which can be rewritten as

$$\Big(\int e^{tR_th}\,d\mu_h\Big)^{1/t}\Big(\int e^{(t-1)h}\,d\mu_h\Big)^{1/(1-t)} \le 1,$$

with $d\mu_h = \dfrac{e^h}{\int e^h\,d\mu}\,d\mu$. Letting $t$ go to 0, we easily deduce (leaving some details to the reader) that,

$$\int \big(\liminf_{t\to 0} R_th\big)\,e^h\,d\mu \le \int e^h\,d\mu\,\log\int e^h\,d\mu.$$



This can equivalently be written as

$$\mathrm{Ent}_\mu(e^h) \le \int \big(h - \liminf_{t\to 0} R_th\big)\,e^h\,d\mu.$$

We conclude using the following claim.

Claim 6.11. For all $x \in V^n$ we have

$$h(x) - \liminf_{t\to 0} R_th(x) \le \frac{1}{4c}\sum_{i=1}^n\Big[\sum_{z\in N_i(x)}\big(h(x) - h(z)\big)_+\Big]^2.$$

□



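Proof of Claim 6.11. By a Taylor expansion and by Proposition 2.7, for all $x, y \in V^n$,

$$\int h(z)\,\nu_t^{x,y}(dz) = \nu_t^{x,y}(h) = \nu_0^{x,y}(h) + t\,d(x,y)\,\nu_0^{x,y}(\nabla_{x,y}h) + o(t) = h(x) + t\,d(x,y)\,\nabla_{x,y}h(x) + o(t),$$

with the quantity $o(t)$ independent of $y$ since $h$ is bounded. Now, from the definition of the sets $N_i(x)$,
$i \in \{1, \dots, n\}$, and using the identity (5.9), one has

$$\nabla_{x,y}h(x) = \frac{1}{|\Gamma(x,y)|}\sum_{\gamma\in\Gamma(x,y)}\big(h(\gamma_+(x)) - h(x)\big) = \sum_{z\in V^n,\ z\sim x}\big(h(z) - h(x)\big)\frac{|\Gamma(x,z,y)|}{|\Gamma(x,y)|} = \sum_{i=1}^n\sum_{z\in N_i(x)}\big(h(z) - h(x)\big)\frac{d(x_i,y_i)\,|\Gamma(x_i,z_i,y_i)|}{d(x,y)\,|\Gamma(x_i,y_i)|}.$$

Therefore

$$h(x) - R_th(x) = \sup_{m}\Big\{\sum_{i=1}^n\sum_{z\in N_i(x)}\int\big(h(x) - h(z)\big)\,d(x_i,y_i)\,\frac{|\Gamma(x_i,z_i,y_i)|}{|\Gamma(x_i,y_i)|}\,m(dy) - c\sum_{i=1}^n\Big(\int d(x_i,y_i)\,m(dy)\Big)^2\Big\} + o(1)$$
$$\le \sum_{i=1}^n\sup_{v\ge 0}\Big\{v\sum_{z\in N_i(x)}\big(h(x) - h(z)\big)_+ - cv^2\Big\} + o(1).$$

The claim follows by letting $t$ go to 0, since $\sup_{v\ge 0}\{vA - cv^2\} = A^2/(4c)$ for every $A \ge 0$.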




□